Genetic and Pathogenic Differences Between Microdochium nivale and Microdochium majus

by

Linda Elizabeth Jewell

A Thesis Presented to The University of Guelph

In partial fulfilment of requirements for the degree of Doctor of Philosophy in Environmental Science

Guelph, Ontario, Canada

© Linda Elizabeth Jewell, December, 2013

ABSTRACT

GENETIC AND PATHOGENIC DIFFERENCES BETWEEN

MICRODOCHIUM NIVALE AND MICRODOCHIUM MAJUS

Linda Elizabeth Jewell, University of Guelph, 2013

Advisor: Professor Tom Hsiang

Microdochium nivale and M. majus are fungal plant pathogens that cause cool-temperature diseases on grasses and cereals. Nucleotide sequences of four genetic regions were compared between isolates of M. nivale and M. majus from Triticum aestivum (wheat) collected in North America and Europe and for isolates of M. nivale from turfgrasses from both continents. Draft genome sequences were assembled for two isolates of M. majus and two of M. nivale from wheat and one from turfgrass. Dendrograms constructed from these data resolved isolates of M. majus into separate clades by geographic origin. Among M. nivale, isolates were instead resolved by host plant. Amplification of repetitive regions of DNA from M. nivale isolates collected from two proximate locations across three years grouped isolates by year, rather than by location.

The mating-type (MAT1) and associated flanking genes of Microdochium were identified using the genome sequencing data to investigate the potential for these pathogens to produce ascospores. In all of the Microdochium genomes, and in all isolates assessed by PCR, only the MAT1-2-1 gene was identified. However, unpaired, single-ascospore-derived colonies of M. majus produced fertile perithecia in the lab. This finding contrasts with the canonical requirements for sexual reproduction among the . To further explore this, MAT1 and flanking gene sequences were identified in the genome sequences of six additional species from Xylariaceae; no homologs of known MAT1-1-1 genes were detected, suggesting that the control of sexual reproduction among the Xylariaceae may be differentially regulated relative to other Sordariomycete species.

Detached leaves of T. aestivum and Poa pratensis (Kentucky bluegrass) were inoculated with either M. nivale or M. majus and were incubated at either 23 ºC or 4 ºC to investigate the infection processes of these pathogens. Despite reported field host preferences, the two pathogens were equally virulent on both host plants at the temperatures investigated. The results presented here reveal genetic, but not pathogenic, differences between M. nivale and M. majus and further demonstrate that sub-populations may exist within the groups of these pathogens on different host plants.

ACKNOWLEDGEMENTS

Firstly, and most importantly, I would like to thank my advisor Dr. Tom Hsiang for accepting me into his lab and inviting me to participate in this exciting and multifaceted research project. I have sincerely appreciated the opportunity that he has given me to learn a very diverse set of skills. I am also very appreciative of the opportunities that he has given me to participate in engaging and inspiring conferences throughout my time as a student. I would also like to thank all of the members of my thesis, qualifying examination, and defense committees for their patience, guidance, and helpful suggestions at every stage of my project. Thanks are also extended to the administrative and support staff at the University of Guelph, especially in the School of Environmental Sciences, for helping to make my time here at the U of G run smoothly.

I would also like to thank the numerous collaborators who provided me with some of the samples or materials that contributed to these analyses.

I thank my friends and my family, especially my parents Bernice and Calvin and my brother Chris for their love and support through all of the long years that I have been in school.

Thank you to all of my labmates, both past and current, for their friendship and help. I would especially like to thank Amy Shi for her help with research, for her amazing skills as a tour guide, and for her cat-wrangling abilities. Thank you Anne-Miet, for being a morning person; to Mihaela, for support and friendship; to Sarah, for being a vegetarian and a travel buddy; to Vince, for his help with bioinformatics; to Brady, for his help with RNA; to unbelievable undergrads Holly, Sara, and Craig, for their assistance in the lab and their friendship; and to everyone else I have had the pleasure of working with, for their kindness and assistance.

Finally, because everyone would be horrified if I left them out, thank you Jim and Luke for your skills as alarm clocks and as supervisors of all fridge-related endeavours.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS ...... iv

TABLE OF CONTENTS ...... v

LIST OF TABLES ...... xi

LIST OF FIGURES ...... xvi

LIST OF APPENDICES ...... xxiii

LIST OF ABBREVIATIONS AND ACRONYMS ...... xxvi

Chapter 1 General Introduction & Literature Review ...... 1

1.1 Introduction ...... 1

1.2 General information about Microdochium nivale and M. majus ...... 2

1.2.1 Disease Cycle ...... 4

1.2.2 Phylogenetic classification of M. nivale and M. majus ...... 5

1.3 Differences between M. majus and M. nivale ...... 6

1.3.1 Morphological characteristics ...... 7

1.3.2 Pathogenic differences ...... 7

1.3.3 Genetic differences ...... 8

1.4 Sexual Reproduction ...... 9

1.4.1 Sexual reproduction in the Ascomycota ...... 9

1.4.2 Sexual reproduction in Microdochium nivale and M. majus ...... 11

1.5 Sequencing techniques and bioinformatics ...... 12

1.5.1 DNA sequencing techniques ...... 12

1.5.2 RNA sequencing ...... 14

1.6 Hypotheses and Objectives ...... 16

1.7 References for Chapter 1 ...... 18

Chapter 2 Phylogenetic Analyses ...... 25

2.1 Introduction ...... 25

2.1.1 Fungal taxonomy and nomenclature ...... 25

2.1.2 Molecular phylogeny ...... 26

2.1.3 Genes used for molecular phylogeny in fungi ...... 27

2.1.4 Tree-building algorithms ...... 28

2.1.5 Genetic differences between and within M. nivale and M. majus ...... 30

2.1.6 Objectives ...... 32

2.2 Materials and Methods ...... 33

2.2.1 Sample Collection ...... 33

2.2.2 DNA extraction ...... 34

2.2.3 Primer design and selection...... 35

2.2.4 PCR protocols and sequencing ...... 36

2.2.5 Sequence alignments and trees ...... 37

2.2.6 Restriction Digests ...... 38

2.3 Results ...... 38

2.3.1 Primer testing and design ...... 38

2.3.2 Sequence differences between Microdochium nivale and M. majus ...... 40

2.3.3 Geographic and host-specific differences ...... 42

2.4 Discussion ...... 43

2.5 References for Chapter 2 ...... 49

Appendices for Chapter 2 ...... 65

Chapter 3 Comparative Genomics ...... 107

3.1 Introduction ...... 107

3.1.1 General overview of whole-genome analyses ...... 107

3.1.2 Sequencing platforms ...... 108

3.1.3 Genome assembly and protein prediction ...... 111

3.1.4 Whole-genome comparisons ...... 114

3.1.5 Objectives ...... 118

3.2 Materials and Methods ...... 118

3.2.1 DNA extraction, quantification, and sequencing ...... 118

3.2.2 Genome assembly and gene prediction...... 119

3.2.3 Whole-genome comparisons and identification of unique genes ...... 121

3.2.4 Design of species-specific primers ...... 122

3.2.5 Identification of putative pathogen-related genes ...... 123

3.2.6 Identification of putative transposable elements...... 124

3.3 Results ...... 124

3.3.1 Genome sequencing, assembly, and protein prediction ...... 124

3.3.2 Whole-genome comparisons ...... 125

3.3.3 Development of species-specific primers ...... 127

3.3.4 Identification of pathogenesis-related genes ...... 128

3.3.5 Identification of putative transposable elements and comparison to PHI genes . 129

3.3.6 Identification of a putative EF-1α sequence in the genome of M. bolleyi ...... 130

3.4 Discussion ...... 131

3.5 References for Chapter 3 ...... 139

Appendices for Chapter 3 ...... 166

Chapter 4 Population Genetics...... 235

4.1 Introduction ...... 235

4.1.1 Genetic diversity ...... 235

4.1.2 Mechanisms of generating genetic diversity ...... 236

4.1.3 Repetitive DNA sequences and genetic diversity ...... 238

4.1.4 Genetic diversity in M. nivale ...... 240

4.1.5 Objectives ...... 242

4.2 Materials and Methods ...... 242

4.2.1 Sample collection and DNA extraction ...... 242

4.2.2 SSR primer design ...... 243

4.2.3 ISSR and SSR PCR protocols ...... 243

4.2.4 Result scoring and analysis ...... 244

4.3 Results ...... 246

4.3.1 ISSR primer screening ...... 246

4.3.2 SSR primer design and screening ...... 247

4.3.3 Linkage disequilibrium calculations ...... 248

4.3.4 Year-to-year and location-to-location comparisons ...... 249

4.4 Discussion ...... 250

4.5 References for Chapter 4 ...... 256

Appendices for Chapter 4 ...... 268

Chapter 5 Mating Type Experiments ...... 275

5.1 Introduction ...... 275

5.1.1 Reproduction in Fungi ...... 275

5.1.2 Genes associated with sexual reproduction ...... 278

5.1.3 Sexual reproduction in M. nivale and M. majus ...... 280

5.1.4 Objectives ...... 282

5.2 Materials and Methods ...... 282

5.2.1 Test of mating type primers based on conserved sequences ...... 282

5.2.2 Identification of putative MAT1 loci and flanking genes and screening of isolate collection ...... 283

5.2.3 Mating experiments ...... 286

5.2.4 Comparison of Microdochium sp. with other species ...... 287

5.3 Results ...... 288

5.3.1 Test of published mating type primers and redesigned universal primers ...... 288

5.3.2 Identification of putative mating type and flanking genes...... 289

5.3.3 Mating experiments ...... 293

5.3.4 Comparison of the Microdochium MAT1 locus to that of other species ...... 297

5.4 Discussion ...... 298

5.5 References for Chapter 5 ...... 312

Appendices for Chapter 5 ...... 344

Chapter 6 Infection Process of Microdochium majus and M. nivale ...... 388

6.1 Introduction ...... 388

6.1.1 The disease cycle ...... 388

6.1.2 The infection processes of M. nivale and M. majus ...... 389

6.1.3 Host specificity and infection success ...... 391

6.1.4 Objectives ...... 391

6.2 Materials and Methods ...... 392

6.2.1 Plant culture ...... 392

6.2.2 Inoculum preparation ...... 392

6.2.3 Inoculation ...... 393

6.2.4 Sample collection, staining, and scoring ...... 394

6.2.5 Statistical methods ...... 395

6.3 Results ...... 396

6.3.1 Experiment A ...... 396

6.3.2 Experiment B ...... 397

6.3.3 Experiment C ...... 398

6.3.4 Experiment D ...... 399

6.3.5 Experiment E ...... 400

6.4 Discussion ...... 401

6.5 References for Chapter 6 ...... 407

Appendices for Chapter 6 ...... 426

Chapter 7 General Discussion and Conclusions ...... 427

7.1 Major conclusions ...... 427

7.2 General Discussion and Conclusions ...... 428

7.3 References for Chapter 7 ...... 438

LIST OF TABLES

Table 2.1 Isolates of Microdochium nivale and M. majus, including their geographic origin and host-plant origin used for nucleotide sequence analysis...... 53

Table 2.2 Primers used in PCR and sequencing reactions...... 56

Table 2.3 List of RPB2 sequences used to design primers for M. nivale and M. majus (, ) with taxonomic information and GenBank accession numbers ... 57

Table 2.4 List of species used to design β-tubulin primers for M. nivale and M. majus with GenBank accession numbers. Other than the Microdochium species, all species included are members of the Xylariaceae...... 59

Table 3.1 Summary of DNA quantity and sequencing facility utilized for genome sequencing of six Microdochium isolates...... 145

Table 3.2 Sordariomycetes genomes included in whole-genome comparisons against Microdochium spp...... 146

Table 3.3 Summary of genome assembly, protein prediction, and predicted gene annotation statistics for sequenced Microdochium genomes ...... 147

Table 3.4 Genome assembly statistics for M. majus assembly with Velvet, SOAPdenovo, and ABySS for odd-numbered kmers 29-59. Note that gap-closing was not performed for this comparison...... 148

Table 3.5 Species-specific primers designed and tested with at least two isolates each of M. nivale and M. majus ...... 149

Table 3.6 Comparisons between the predicted gene sequences from M. nivale, M. majus, and M. bolleyi against each other and against the genomes of 6 (non-Microdochium) members of the Xylariales and 9 non-Xylariales members of the Sordariomycetes. Comparisons were performed using tBLASTn with an e-value cutoff of 1e-05...... 150

Table 3.7 Comparisons between the predicted gene sequences from M. nivale, M. majus, and M. bolleyi against each other and against the genomes of 6 (non-Microdochium) members of the Xylariales and 9 non-Xylariales members of the Sordariomycetes. Comparisons were performed using tBLASTn with an e-value cutoff of 1e-20...... 152

Table 3.8 Comparisons between the predicted gene sequences from M. nivale, M. majus, and M. bolleyi against each other and against the genomes of 6 (non-Microdochium) members of the Xylariales and 9 non-Xylariales members of the Sordariomycetes. Comparisons were performed using tBLASTn with an e-value cutoff of 1e-50...... 154

Table 3.9 Fungal pathogen-host interaction (PHI) genes with highly variable copy numbers among Microdochium predicted gene sets that may play a role in pathogenicity or resistance (Mb = M. bolleyi; Mm = M. majus; Mn = M. nivale) ...... 156

Table 3.10 Accession numbers for ten randomly-selected putative unique predicted gene sequences identified in Microdochium spp...... 157

Table 3.11 Transposable element sequences downloaded from GenBank that were used in comparisons against the genomes of Microdochium spp...... 158

Table 3.12 Summary of putative transposable element sequences identified in the Microdochium spp. genomes and their relative proximity to putative pathogen-host interaction genes ...... 160

Table 4.1 Year and location of collection from the Guelph Turfgrass Institute for all samples included in multi-year screening. See Figure 4.3 for a map depicting these locations...... 260

Table 4.2 List of all SSR and ISSR primers screened to assess genetic variation in Microdochium nivale field isolates collected across three years...... 261

Table 4.3 List of all SSR and ISSR primers selected for analysis of genetic variation in Microdochium nivale field isolates collected across three years...... 262

Table 4.4 Results of linkage disequilibrium calculations performed using the program Disequil (described in Mahuku et al. 1998) on isolates of Microdochium nivale collected in three separate years and in two locations at the Guelph Turfgrass Institute. See Figure 4.3 for a map depicting these locations...... 263

Table 5.1 Published mating-type primers tested with M. nivale and M. majus ...... 316

Table 5.2 Isolates of M. nivale and M. majus tested with published mating-type primers...... 317

Table 5.3 List of species and GenBank accession numbers for genes used to design conserved MAT1-1-1 and MAT1-2-1 primers in Microdochium sp...... 318

Table 5.4 Primers designed to amplify mating-type (MAT1-1 and MAT1-2) and flanking genes in Microdochium spp...... 320

Table 5.5 Number of bands amplified by the ISSR and SSR PCR primers listed with a selection of Microdochium majus isolates including the single-ascospore-derived (AS) cultures derived from the parent isolate 99049...... 321

Table 5.6 List of species and GenBank accession numbers for genes used to search for putative MAT1-1-1, MAT1-2-1, and flanking genes in Microdochium genomes by standalone tBLASTn ...... 322

Table 5.7 List of predicted genes corresponding to the mating-type (MAT1-2-1) and flanking genes (cytoskeletal assembly protein SLA2 and DNA lyase APN2) in the Microdochium genomes studied. Comparisons were performed using standalone tBLASTn to query the gene of interest against the Microdochium spp. genomes listed...... 323

Table 5.8 Results of MAT1-2-1 amplification of M. nivale isolates with the primers Mn_MAT2_20F and Mn_MAT2_727R ...... 324

Table 5.9 Isolates used in mating-type crosses ...... 325

Table 5.10 Summary of perithecial production in second experiment according to temperature of incubation and species included in each cross. Isolates were inoculated on sterilized wheat straw overlaid on water agar, and observations were performed after two months of incubation...... 326

Table 5.11 Species used for MAT-region synteny investigation. All species were members of the order Xylariales and family Xylariaceae ...... 327

Table 5.12 BLASTx results for putative matches to flanking and mating genes observed in Xylariales genomes. BLAST searches were performed by querying the putative flanking genes (cytoskeletal assembly protein SLA2, DNA lyase APN2, anaphase-promoting complex protein APC5, and cytochrome oxidase COX) and mating type gene (MAT2) against the GenBank non-redundant database...... 328

Table 6.1 Summary of conditions tested in infection process experiments performed for M. nivale and M. majus inoculated on P. pratensis and T. aestivum...... 409

Table 6.2 Isolates of M. nivale and M. majus used in infection process experiments on T. aestivum and P. pratensis...... 410

Table 6.3 Sample collection timepoints for infection process studies of M. nivale and M. majus on T. aestivum and P. pratensis...... 411

Table 6.4 Comparisons between mean number of penetration observations per unit area for each isolate of Microdochium nivale (Mn) and M. majus (Mm) on detached leaves of Kentucky bluegrass (K) and wheat (W) in experiment D. Data for each isolate at each time point were compared using a two-sided Wilcoxon rank sum approximation. Means followed by different letters were significantly different at p = 0.05...... 412

Table 6.5 Comparisons between mean number of penetration observations per unit area for all isolates, regardless of identity, from each host type on Kentucky bluegrass and wheat in experiment D. Data for each host type at each time point were compared using a two-sided Wilcoxon rank sum approximation. Means followed by different letters were significantly different at p = 0.05...... 413

Table 6.6 Comparisons between mean number of penetration observations per unit area for each isolate of Microdochium nivale (Mn) or M. majus (Mm) on detached leaves of Kentucky bluegrass (K) and wheat (W) in experiment E. Data for each isolate at each time point were compared using a two-sided Wilcoxon rank sum approximation. Means followed by different letters were significantly different at p = 0.05...... 414

Table 6.7 Comparisons between mean number of incidences of penetration per unit area for all isolates from each host type on detached leaves of Kentucky bluegrass and wheat in experiment E. Data for each host type at each time point were compared using a two-sided Wilcoxon rank sum approximation. Means followed by different letters were significantly different at p = 0.05...... 415

LIST OF FIGURES

Figure 1.1 Pink mould damage on mixed creeping bentgrass / annual bluegrass green (Agrostis stolonifera / Poa annua), Guelph, ON, 02 March 2012...... 24

Figure 2.1 Number of research journal papers published between 2000 and 2012 that include the terms "maximum likelihood", "maximum parsimony", and either "neighbor joining" or "neighbour joining", obtained by searching the Google Scholar database using the term phylog* together with either "maximum likelihood", "maximum parsimony", or "'neighbour joining' OR 'neighbor joining'"...... 60

Figure 2.2 Gel image for European (lanes A-F) and North American (lanes G-I, K-O) isolates of Microdochium nivale amplified with the Y13NF and Y13NR primers, which target a genetic region of unknown function. All six of the European isolates (lanes A-F) were strongly amplified, whereas weak bands are present in only two (lanes H, I) of the North American samples. A negative control, containing water instead of DNA, was included in the reaction (lane Q). Select sizes (bp) from a 100 bp increment ladder (lane J) are indicated to the right...... 61

Figure 2.3 Bootstrapped maximum likelihood trees for RPB2, β-tubulin, EF-1α, and ITS. The tips of the tree are labelled with isolate number, species (M or N for M. majus or M. nivale, respectively), host origin (W or T for wheat or turfgrass), and geographic origin (NA or EU for North America or Europe). Bootstrap values out of 100 are located on the respective branches. Nodes with less than 50% support were collapsed. Separate clades resolving the sister species and host-specific or geographic groupings are indicated by curly braces...... 63

Figure 2.4 Gel image for HindIII digest of RPB2 amplicon of M. nivale (lanes A-C) and M. majus (lanes D-F). Partial RPB2 sequences were amplified with the primers RPB_150F and fRPB2-7cR. The amplicons of M. nivale isolates were digested at two locations to produce fragments that were 125, 444, and 475 bp in length and the M. majus amplicons were digested at only one location, producing fragments that were 444 and 596 bp in length. Note that the 444 and 475 bp bands were not well-resolved. A non-digested RPB2 amplicon (1,040 bp in length) from M. majus is included for comparison (lane G). Select sizes (bp) from a 100 bp increment ladder (lane J) are indicated to the right...... 64

Figure 3.1 Pipeline for DNA sequencing by Illumina-Solexa technology. Genomic DNA (1) is isolated and (2) sheared into millions of fragments. Adapter sequences are ligated onto both sides of all fragments (3). The fragments are introduced to the solid surface (4), which contains sequences that are complementary to the adapters. In the PCR stage (5), the second adapter sequence of each fragment (white) anneals to its complement. The fragment is amplified by DNA polymerase to produce a complementary strand (6). In each round of PCR, both strands are duplicated (7). The result is clusters of identical complementary sequences (8) for each of the original fragments. In the sequencing stage, a solution containing all four dNTPs, each labelled with a different terminating fluorophore, is introduced (9). Only one nucleotide at a time can be introduced (10) due to the terminator sequence. The incorporated nucleotide is identified by the fluorophore. The fluorophore terminator is cleaved (11) to allow for the addition of the next nucleotide. The labelled dNTP solution is re-applied (12), the second nucleotide is incorporated (13), and step 11 is repeated. This process is repeated until the full fragment has been sequenced (14) (Mardis 2008)...... 162

Figure 3.2 Summary of paired-end sequencing. Genomic DNA is sheared into fragments of approximately the same length (e.g. approximately 500 bp). Each fragment is then sequenced from both the 5' and the 3' ends, generating two "paired" fragments (labelled A and B and joined by a dashed arc). The number of sequencing cycles performed is equal to the lengths of these fragments (e.g. for strands consisting of two paired-end fragments of 100 bp each, a total of 200 cycles would be performed). During the assembly process, the expected distance between these two fragments (in this example, 300 bp) is used to facilitate their association with the other fragments generated during sequencing...... 163

Figure 3.3 Schematic representation of a de Bruijn graph (Zerbino and Birney 2008). Nodes are represented by boxes. Each node consists of a short alignment of overlapping sequences of the same length. Each node also has a sister node consisting of the reverse complement of the sequences and the alignment in its sister, located immediately above or below the node (e.g. A and A' are sister nodes). Nodes are connected based on their sequence similarities: for example, the final sequence in node A shares four of its six nucleotides with the first sequence in node B. Node B is connected both to node A and to nodes C and D, because the final sequence in node B overlaps equally well with the first sequences in both nodes C and D. The use of de Bruijn graphs results in the association of short, overlapping alignments that are used to assemble the sequencing reads into larger contigs or scaffolds...... 164

Figure 3.4 a) Neighbour-joining, b) maximum likelihood, and c) maximum parsimony trees depicting the relationships between the sequenced Microdochium genomes, based on the concatenated sequences of ten genes that were putatively unique to Microdochium. Bootstrap values (out of 100) are displayed on each node. The ten genes are listed in Table 3.10. Scale bars represent either 0.1 nucleotide change per base (a and b) or 100 substitutions (c)...... 165

Figure 4.1 Diagram depicting the relative positions of hypothetical SSR and ISSR loci and the primers that could amplify these regions. The SSR loci (white text on black background) are flanked by ISSR regions (black text on white background). Whereas the ISSR primer (white text on black arrow) is anchored within the repetitive SSR region, and thus can be designed with only knowledge about the repetitive SSR sequence, the SSR primers (black text on white arrows) are located within the non-repetitive ISSR sequences, and thus require more detailed genomic information...... 264

Figure 4.2 Grasses displaying symptoms of pink snow mold and / or patch. Both photos were taken at the GTI, Guelph, Ontario, 2 March 2012. a) Kentucky bluegrass in the area to the east of the native green and b) annual bluegrass / creeping bentgrass mixture, native green...... 265

Figure 4.3 Map of the Guelph Turfgrass Institute with collection locations and key landmarks indicated: A: Near pathology green (PG); B: roadside (Rd). The grass species at both collection locations was Poa pratensis (Kentucky bluegrass)...... 266

Figure 4.4 UPGMA tree depicting relationships between M. nivale isolates collected from P. pratensis at two locations (Figure 4.3) yearly from 2011-2013. Bootstrap values (out of 100) are displayed on key nodes. Legend: 1: 2013 pathology green isolates; 2: 2013 roadside; 3: 2012 pathology green; 4: 2012 roadside; 5: 2011 pathology green; 6: 2011 roadside...... 267

Figure 5.1 Orientation of the MAT1 region and flanking genes in the Sordariomycete species Neurospora crassa (Butler et al. 2004), Gibberella zeae (Yun et al. 2000), Botrytis cinerea, and Sclerotinia sclerotiorum (Amselem et al. 2001). Diagram is not to scale. A vertical bar extending over the MAT1 locus indicates that this species is heterothallic and that the gene(s) located on the parallel bars are interchangeable in the two mating types...... 338

Figure 5.2 Orientation and synteny of the putative MAT1 region and the flanking genes APC5, SLA2, APN2, and COX13 in several species of Xylariales, including Microdochium sp. Diagrams are not to scale. A double slash (//) indicates a long (>10 kb) distance between putative genes found on the same scaffold, and a vertical bar (|) indicates that the gene(s) that follow was / were found on a different scaffold...... 339

Figure 5.3 Single segment of wheat straw inoculated with M. majus isolate 99049 and M. nivale isolate 99077 incubated on water agar at 20 °C for approximately two months. Note the production of perithecia on the side closest to M. majus (A) but not M. nivale (B). 10x magnification...... 340

Figure 5.4 Perithecium of M. majus isolate 99049 at (A) 40x and (B) 100x magnification. Perithecia depicted were observed after two months of incubation on wheat straw on water agar at 20 ºC ...... 341

Figure 5.5 Ascospore (centre) produced by M. majus isolate 99049 at 400x magnification. The spore depicted was observed after two months of incubation on wheat straw on water agar at 20 ºC...... 342

Figure 5.6 Bootstrapped UPGMA tree depicting the relationships between ten single-ascospore cultures derived from M. majus isolate 99049 relative to their parent culture and DNA from seven other M. majus isolates, including one isolate collected from the same location on the same date as the parent culture (99061), two isolates from Europe (10098 and 10099), and four cultures collected from the same wheat field on the same date (12043-12046). The horizontal bar represents 10% sequence divergence. Bootstrap values are out of 100...... 343

Figure 6.1 The disease cycle describing the events that occur during a host-pathogen interaction (modified from Agrios 2005) ...... 416

Figure 6.2 Number of detached segments of P. pratensis (A) and T. aestivum (B) where penetration was observed at time of collection for leaf blades treated with hyphal inoculum in experiment A. Three leaf blades were collected at each time point...... 417

Figure 6.3 Penetration of stomata of detached leaves of T. aestivum by hyphae of M. majus isolate 99049 (stained blue) incubated on moistened filter paper at 22 ºC. Photo taken at 3 dpi, 400x magnification...... 418

Figure 6.4 Hyphae of M. majus isolate 99061 emerging from stomata of detached leaves of wheat (circled) incubated on moistened filter paper at 22 ºC. Photo taken at 400x magnification, 5 dpi...... 419

Figure 6.5 Number of detached leaf segments of P. pratensis (A) and T. aestivum (B) incubated on moist filter paper where penetration was observed at time of collection for leaf blades treated with hyphal inoculum in experiment B. Three leaf blades were collected at each time point. .. 420

Figure 6.6 Number of detached leaf segments of P. pratensis (A) and T. aestivum (B) incubated on moist filter paper where penetration was observed at time of collection for leaf blades treated with conidial inoculum in experiment B. Three leaf blades were collected at each time point. 421

Figure 6.7 Number of detached leaf segments of P. pratensis (A) and T. aestivum (B) incubated on moist filter paper where penetration was observed at time of collection for leaf blades treated with hyphal inoculum in experiment C...... 422

Figure 6.8 Number of detached leaf segments of P. pratensis (A) and T. aestivum (B) incubated on moist filter paper where penetration was observed at time of collection for leaf blades treated with conidial inoculum in experiment C...... 423

Figure 6.9 Number of incidences of penetration per unit area on P. pratensis (A) and T. aestivum (B) incubated on moist filter paper at time of collection for leaf blades treated with hyphal inoculum in experiment D...... 424

Figure 6.10 Number of incidences of penetration per unit area on P. pratensis (A) and T. aestivum (B) incubated on moist filter paper at time of collection for leaf blades treated with hyphal inoculum in experiment E...... 425

LIST OF APPENDICES

Appendix 2.1 Alignment of RPB2 sequences from Sordariomycete species. Primer-binding sites are indicated by shading...... 65

Appendix 2.2 Alignment of β-tubulin sequences from Sordariomycete species. Priming sites are indicated by shading...... 76

Appendix 2.3 RPB2 alignment of M. nivale and M. majus sequences with HindIII digest sites indicated by shading...... 91

Appendix 2.4 Alignment of β-tubulin sequences from M. nivale and M. majus. Primer-binding sites are indicated by shading...... 96

Appendix 2.5 Alignment of EF-1α sequences from M. nivale and M. majus. Primer-binding sites are indicated by shading...... 100

Appendix 2.6 Alignment of ITS sequences from M. nivale and M. majus. Primer binding sites and RsaI restriction sites indicated by shading...... 103

Appendix 3.1 Sample script to execute SOAPdenovo...... 166

Appendix 3.2 Sample configuration file for SOAPdenovo ...... 167

Appendix 3.3 Sample script used to execute ABySS ...... 168

Appendix 3.4 Sample script used to execute Velvet...... 169

Appendix 3.5 Sample script used to execute AUGUSTUS ...... 170

Appendix 3.6 Annotate_genes.pl ...... 171

Appendix 3.7 parse_m9.pl ...... 173

Appendix 3.8 make_simple_table_v2.pl ...... 174

Appendix 3.9 summarize_with_files_v2.pl ...... 177

Appendix 3.10 find_genes_of_diff_length.pl ...... 189

Appendix 3.11 compare_phi_results.pl ...... 191

Appendix 3.12 check_proximity.pl...... 196

Appendix 3.13 eliminate_duplicates.pl...... 198

Appendix 3.14 Alignment of putative M. bolleyi 07020 EF-1α sequence with the Microdochium EF-1α primers EFNivF and EFMajF and the reverse complement of primer EFMicR ...... 199

Appendix 3.15 Alignment of predicted gene sequences that are putatively unique to Microdochium spp...... 201

Appendix 4.1 Sample input for linkage disequilibrium calculation with Arlequin ...... 268

Appendix 4.2 Banding patterns detected with primer groups studied (n= 136). Multiple rows with the same number of bands represent different banding patterns based on band sizes...... 273

Appendix 5.1 Alignment of MAT1-1-1 sequences collected from GenBank for primer design ...... 344

Appendix 5.2 Alignment of MAT1-2-1 sequences collected from GenBank for primer design ...... 353

Appendix 5.3 Alignment of SLA2 coding sequences. Note that trailing sequence has been truncated...... 355

Appendix 5.4 Alignment of MAT1-2-1 coding sequences ...... 362

Appendix 5.5 Alignment of APN2 coding sequences ...... 366

Appendix 5.6 Alignment of M. majus 99049 MAT-region coding sequences with those of M. nivale 11037, with primer loci indicated for SLA2 ...... 371

Appendix 5.7 Alignment of M. majus 99049 MAT-region coding sequences with those of M. nivale 11037, with primer loci indicated for MAT1-2-1 ...... 376

Appendix 5.8 Alignment of M. majus 99049 MAT-region coding sequences with those of M. nivale 11037, with primer loci indicated for APN2 ...... 380

Appendix 5.9 Observed ascospore dimensions ...... 384

Appendix 5.10 Supplemental Results: Test of published mating type primers and redesigned universal primers ...... 385

Appendix 5.11 Supplemental results: Identification of flanking genes in May 2011 99049 assembly ...... 387

Appendix 6.1 SAS Statements ...... 426


LIST OF ABBREVIATIONS AND ACRONYMS

AFLP: Amplified Fragment Length Polymorphism
BLASTn: Basic Local Alignment Search Tool (nucleotide vs. nucleotides)
BLASTp: Basic Local Alignment Search Tool (protein vs. proteins)
BLASTx: Basic Local Alignment Search Tool (translated nucleotide vs. proteins)
bp: Base Pair(s)
d: Day
ddNTP: DiDeoxyNucleotide TriPhosphate
DNA: DeoxyriboNucleic Acid
dNTP: DeoxyNucleotide TriPhosphate
dpi: Days Post-Inoculation
EDTA: EthyleneDiamineTetraacetic Acid
EF-1α: Elongation Factor-1α
FHB: Fusarium Head Blight
IA: Index of Association
ISSR: Inter Simple Sequence Repeat
ITS: Internal Transcribed Spacer
JGI: Joint Genome Institute
h: Hours
HMG: High-Mobility Group
hpi: Hours Post-Inoculation
HW: Hardy-Weinberg
KB: Kentucky Bluegrass (Poa pratensis)
LSD: Lab Services Division
LSU: Large Subunit
MAT: MAting Type
Mb: MegaBase(s) (i.e. 10^6 base pairs)
min: Minute(s)
ML: Maximum Likelihood
MP: Maximum Parsimony
mRNA: Messenger RiboNucleic Acid
N50: the contig or scaffold size at which 50% of all bases in the assembly are contained in contigs or scaffolds of this size or larger
NCBI: National Center for Biotechnology Information
NIH: National Institutes of Health
NJ: Neighbour-Joining
NGS: Next-Generation Sequencing
PCR: Polymerase Chain Reaction
Pfam: putative Protein FAMily
PHI: Pathogen-Host Interaction
PDA: Potato Dextrose Agar
rDNA: Ribosomal DeoxyriboNucleic Acid
rRNA: Ribosomal RiboNucleic Acid
RIP: Repeat-Induced Point mutation
RNA: RiboNucleic Acid
RPB2: Ribonucleic acid Polymerase binding Protein, 2nd largest subunit
rpm: Revolutions Per Minute
s: Second(s)
SA: Single-Ascospore
SBS: Sequencing By Synthesis
SDS: Sodium Dodecyl Sulfate
SNP: Single-Nucleotide Polymorphism
SSU: Small SubUnit
TBE: Tris Borate EDTA
tBLASTn: Basic Local Alignment Search Tool (protein vs. translated nucleotides)
tBLASTx: Basic Local Alignment Search Tool (translated nucleotides vs. translated nucleotides)
TE: Tris EDTA
TE: Transposable Element
Tris: tris(hydroxymethyl)aminomethane
× g: times gravity (relative centrifugal force)
UK: United Kingdom
UPGMA: Unweighted Pair Group Method with Arithmetic mean
USA: United States of America
UV: Ultra Violet
v/v: volume of solute per volume of solvent (percent)
VCG: Vegetative Compatibility Group
W: Wheat (Triticum aestivum)
w/v: weight of solute per volume of solvent (percent)


Chapter 1 General Introduction & Literature Review

1.1 Introduction

Plants are necessary not only as food for humans, wild animals, insects, and livestock, but they also perform critical ecological functions, provide materials for construction and textiles, and are valuable sources for both traditional and allopathic medicines. In addition to these vital roles, plants are part of our recreational environments such as sports fields, gardens, and forests.

Plants, like all other living things, are subject to disease. Plant pathology is the multidisciplinary field that encompasses the diseases of plants, their causes, mechanisms, and treatments (Agrios 2005). A plant is generally understood to be diseased when its ability to grow or reproduce normally is inhibited, either by abiotic factors such as an imbalance in moisture or nutrient levels, or by biotic factors including viruses, bacteria, and fungi. Of these threats, fungi are the most common cause of plant diseases (Agrios 2005).

The first fungicide was Bordeaux mixture, a mixture of copper sulfate and calcium hydroxide, which in 1885 was found to suppress downy mildew on grapes (Agrios 2005).

Since this time, a variety of natural and synthetic fungicides have been discovered. However, insufficient understanding of the dangers of some of the chemicals that were used as early fungicides, especially mercurial compounds (Agrios 2005), has contributed to a negative perception of fungicides among the general public (Gullino and Kuijpers 1994; Ragsdale and Sisler 1994). In 2008, the Ontario government banned the use of fungicides and other pesticides for cosmetic uses (Anonymous 2008), although golf courses were exempt from this ban. These changes, in addition to restrictions on individual fungicides (Anonymous 2010), suggest that fungicide use in the future will become more restricted. To facilitate the development of pathogen-targeted controls, further research into the basic biology of plant pathogens is necessary.

The intensity of fungicide use on golf courses is among the highest in any agricultural or horticultural sector based on the rate per hectare (Anonymous 1998), and by area, more money is spent for turfgrass disease control than for any other cultivated plant (Nelson 1992). In Canada, the greatest single use of fungicides is for snow mould control across the country (Hsiang et al. 1999), and snow moulds are the most economically important turfgrass diseases (Jung et al. 2007). Among the snow mould diseases, grey snow mould caused by Typhula species, and pink snow mould caused by Microdochium species are the most important (Jung et al. 2007). Pink snow mould and other diseases caused by Microdochium on turfgrasses are likely the most common turfgrass diseases in cool, wet climates (Hsiang 2009).

1.2 General information about Microdochium nivale and M. majus

Microdochium nivale (Fr.) Samuels & Hallett and M. majus (Wollenw.) Glynn & S.G. Edwards¹ are ascomycete fungal plant pathogens found worldwide in cool and temperate regions (Kammoun et al. 2009; Lees et al. 1995; Nakajima and Naito 1995; Waalwijk et al. 2003). Both species attack cereals (e.g., Hordeum vulgare L., Triticum spp.), and M. nivale is also a pathogen of cool-season grasses (e.g., Agrostis, Lolium, and Poa) (Simpson et al. 2000). Whereas most plant pathogens in temperate regions are not active during the winter (Agrios 2005), M. nivale and M. majus continue to grow at temperatures as low as -5 °C (Snider et al. 2000). This adaptation allows these fungi to attack their host plants under snow cover, where they cause pink snow mould.

¹ These species were previously known as Fusarium nivale (Fr.) Sorauer (or F. nivale Ces. ex Berl. & Voglino) and F. nivale var. majus Wollenw., or Microdochium nivale var. nivale and M. nivale var. majus (Wollenw.) Samuels & I.C. Hallett, respectively (Glynn et al. 2005).

Pink snow mould is a common disease on graminaceous plants with live aerial parts that are covered by snow for at least six weeks, and is characterized by bleached to orange-brown patches of matted leaf tissue up to 20 cm in diameter, which may also display white or pink mycelium (Figure 1.1) (Hsiang 2009). Following wet and cool weather (between 0 and 15 °C), M. nivale may also cause a disease on grasses known as Fusarium patch, also called Microdochium patch. This disease is characterized by irregular bleached or brown-orange patches up to 5 cm in diameter, the edges of which may be bronze in colour when the pathogen is actively growing (Hsiang 2009). Individual patches may coalesce, resulting in large areas of damaged turf. This disease is a serious problem on managed turf surfaces, such as golf courses (Smiley et al. 2005).

In addition to pink snow mould and Fusarium patch, M. nivale and M. majus are also among the causative agents of Fusarium head blight (FHB), a common and serious disease of wheat and barley (Ioos et al. 2004). Although FHB can be caused by many different pathogens (Pirgozliev et al. 2003), notably Fusarium culmorum and F. graminearum, M. nivale sensu lato is among the most common and severe causal agents (Ioos et al. 2004). European studies have suggested that the incidence of M. nivale sensu lato in disease outbreaks has increased in recent years (Ioos et al. 2004). Cereals afflicted with FHB turn brown and wither, and infected kernels are lower in biomass and nutritive value relative to unaffected crops. In severe outbreaks, crop yields may be reduced by 50% (Windels 1999).

The prevention of snow mould diseases, including pink snow mould, accounts for almost 50% of the yearly fungicide use on turfgrass in Canada (Hsiang et al. 1999). Unfortunately, despite such management efforts, M. nivale and M. majus remain common and serious pathogens on wheat and turf in Canada and Europe. Recent studies have demonstrated resistance among M. nivale to the strobilurin fungicides that were effective in the past (Walker et al. 2009).

1.2.1 Disease Cycle

Microdochium nivale and M. majus are most likely spread by the dispersal of infected plant materials, including seeds and plant debris such as stem cuttings (Snyder and Nash 1968) which may be left behind following harvest or maintenance. These pathogens persist and grow in , and mycelia in particular are known to cause rapid and severe infection relative to conidia (Pronczuk and Messyasz 1991). Ascospores may also constitute an important source of primary inoculum (Mahuku et al. 1998). Infected seeds (Cristani 1992; Humphreys et al. 1995, 1998) may also spread disease from one field to another, although seeds are now regularly tested for fungal infestation and treated with fungicide to mitigate this problem (Glynn et al. 2008). While many fungal plant pathogens use specialized structures or thick-walled spores to survive the winter (e.g., chlamydospores, sclerotia, or ascospores) (Agrios 2005), Microdochium nivale sensu lato is a psychrophilic organism capable of growth between approximately -5 and 30 °C (Snider et al. 2000), producing mycelium and actively damaging plant tissues under the snow and during the early and late fall in temperate climates (Hoshino et al. 2009).

To thrive at cool temperatures, M. nivale sensu lato is known to modify the composition of its lipid membrane when subjected to low temperatures by incorporating additional triacylglycerols containing unsaturated fatty acids, especially linolenic (18:3) and linoleic (18:2) acids (Istokovics et al. 1998). It also possesses an unusual betaine lipid that is also found in the snow-mould-causing fungi Typhula ishikariensis and T. incarnata, although the specific function of this lipid is unknown (Istokovics et al. 1998). Unlike other cold-tolerant organisms, such as some species of bacteria and amphibians, Microdochium nivale sensu lato is not known to produce antifreeze proteins to prevent the formation of ice crystals at temperatures below 0 °C (Snider et al. 2000).

1.2.2 Phylogenetic classification of M. nivale and M. majus

Microdochium nivale was first identified by Fries (1825), and was given the name Lanosa nivalis. This organism was reclassified and placed in the genus Fusarium as F. nivale (Fr.) Sorauer (1901); however, the same name had been used based on a separate type specimen, Fusarium nivale Ces. ex Berlese & Voglino (1886). Wollenweber (1930) differentiated a new variety, F. nivale var. majus, from the type variety F. nivale var. nivale. The lack of a conidial foot cell, a key feature in the genus Fusarium, was used as an important character to support the reclassification of the fungus as Gerlachia nivale (Gams and Müller 1980). This classification was contested by Samuels and Hallett, who synonymised the genus Gerlachia W. Gams & E. Müll. with the older genus Microdochium Syd. & P. Syd. (1983). A recent examination of the elongation factor 1-α gene, in addition to the morphological and pathogenic differences described elsewhere in the literature (section 1.2), led to the reclassification of Microdochium nivale var. nivale and M. nivale var. majus as separate species. The former retained the name M. nivale, while the latter was named M. majus (Glynn et al. 2005).

Under the now-outdated dual nomenclature system for fungi (Taylor 2011), the sexual states (teleomorphs) of M. nivale sensu lato have had a similar history of reclassification. The sexual stage of Fusarium nivale was named nivalis Schaffnit in 1913. This name is a synonym of the older name Nectriella graminicola (Berk. & Broome) Niessl. The teleomorph was then transferred to the genus Griphosphaeria by Muller & von Arx (1955) due to its production of subepidermal, darkly-pigmented perithecia and to Micronectriella by Booth (1971), despite the presence of unitunicate asci in M. nivalis (Samuels and Hallett 1983), and bitunicate asci in the type species for Micronectriella (Shoemaker 1981). The most recently accepted teleomorph names for M. nivale and M. majus were Monographella nivalis (Mueller 1977) and Monographella nivalis var. neglecta (Gams and Müller 1980). The recent "one name - one fungus" resolution of the International Botanical Congress has eliminated the dual nomenclature system for fungi (Taylor 2011). Although it is possible that the names of M. nivale and M. majus may change as a result of this policy, no decision has yet been reached for these species (K. Seifert, personal communication; T. Gräfenhan, personal communication). For this reason, I use the more familiar names Microdochium nivale and M. majus that traditionally have been associated with the asexual stage (anamorph).

1.3 Differences between M. majus and M. nivale

Although M. nivale and M. majus were regarded as conspecific until 2005 (Glynn et al. 2005), consistent differences between these organisms have been known for over 80 years (Wollenweber 1930). Differences exist between M. majus and M. nivale with respect to host preferences (Simpson et al. 2000), morphology (Litschko and Burpee 1987), production of sexual spores (Smith 1983), and genetic information (Glynn et al. 2005; Maurin et al. 1995; Nicholson et al. 1996; Parry et al. 1995).


1.3.1 Morphological characteristics

On potato dextrose agar (PDA), the hyphae of Microdochium majus and M. nivale are white to salmon-pink in colour, with a growth rate between 0.13 and 0.37 mm/h at 20 °C (Litschko and Burpee 1987). Although initially considered to be different varieties of M. nivale sensu lato, M. majus and M. nivale have been recognized as distinct since 1930 based primarily on the difference between their average conidial size (Wollenweber 1930). The conidia of M. majus are larger in size (width from 4.2-6.0 µm, length 15-33 µm) than those of M. nivale (width no larger than 3.8 µm, length 8-27 µm), and possess more septa (1-7 compared to 0-3) (Gerlach 1982).

However, these character ranges overlap, and some researchers have found that many individual isolates of these species fall within an ambiguous range, rendering morphological differentiation alone unreliable (Lees et al. 1995; Litschko and Burpee 1987).
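The overlap described above can be illustrated with a minimal Python sketch. The ranges are those quoted from Gerlach (1982); the function name, the data layout, and the lower width bound of 0 µm for M. nivale (the text gives only an upper limit) are assumptions made for illustration, not part of any published identification key.

```python
# Conidial ranges as quoted in the text (Gerlach 1982).
# The lower width bound of 0.0 for M. nivale is an assumption: the text only
# states "no larger than 3.8 µm".
RANGES = {
    "M. majus": {"width": (4.2, 6.0), "length": (15.0, 33.0), "septa": (1, 7)},
    "M. nivale": {"width": (0.0, 3.8), "length": (8.0, 27.0), "septa": (0, 3)},
}


def species_consistent_with(character, value):
    """Return the species whose quoted range for one character contains the value."""
    matches = []
    for species, limits in RANGES.items():
        low, high = limits[character]
        if low <= value <= high:
            matches.append(species)
    return matches


# A 20 µm conidium with 2 septa fits both species on those two characters.
print(species_consistent_with("length", 20.0))
print(species_consistent_with("septa", 2))
# Width separates the quoted ranges, but a value of 4.0 µm fits neither.
print(species_consistent_with("width", 5.0))
print(species_consistent_with("width", 4.0))
```

Under these assumptions, length and septum number are uninformative across much of their ranges, which is consistent with the reliance on average conidial size rather than on single measurements.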

The production of ascospores has been infrequently observed, but some differences may exist between conditions favouring perithecial formation in these two species, where M. majus appears to develop perithecia more readily than M. nivale (Litschko and Burpee 1987). Nevertheless, conidial variation, despite overlapping attributes, remains the only non-molecular physical characteristic consistently available for distinguishing between these species.

1.3.2 Pathogenic differences

Although both M. nivale and M. majus are known pathogens of graminaceous plants, consistent differences have been observed in the distribution of these pathogens among their hosts. When inoculated alone or competitively on wheat, oat, and rye seedlings, both M. nivale and M. majus were capable of causing disease on each host; however, in the mixed inoculation experiments M. nivale strongly outcompeted M. majus on rye while M. majus preferentially colonized wheat (Simpson et al. 2000). Similarly, in a detached-leaf assay, M. majus displayed a faster colonization rate and induced larger lesions than M. nivale (Diamond and Cooke 1997).

When evaluating the extent of fungal colonization on three cultivars of , M. nivale caused more serious infection than M. majus (Hofgaard et al. 2006). Populations of M. nivale living on different hosts may represent specialized individuals with little gene flow between populations; indeed, M. nivale is known to possess a wider host range and exhibits a higher level of genetic diversity relative to M. majus (Mahuku et al. 1998). The relationship between M. nivale's genetic diversity and its wide host range has not been well-explored.

1.3.3 Genetic differences

Sequence differences between M. nivale and M. majus have been identified in a number of genetic regions. Specific RFLP profiles of the ITS region (Lees et al. 1995) have been described for M. nivale and M. majus. Sets of primers designed to amplify only one variety have been developed for an unspecified genetic region (Nicholson et al. 1996) and for the gene for elongation factor 1α (EF-1α) (Glynn et al. 2005). In addition to providing useful tools for the identification of new isolates from the field, these studies have demonstrated the magnitude of variation between M. nivale and M. majus. Some genetic differences connected to host origin have also been noted. In a study of M. nivale isolates from turfgrass, isolates were more similar to other isolates from the same host species than to isolates from different host species that were found in close proximity (Mahuku et al. 1998). Conversely, a primer set designed by European researchers to differentiate between M. nivale and M. majus was unable to amplify DNA from North American M. nivale isolates (Glynn et al. 2005). This observation, in combination with the other differences described above, led Glynn et al. (2005) to propose that the two varieties M. nivale var. nivale and M. nivale var. majus represent distinct species.

1.4 Sexual Reproduction

1.4.1 Sexual reproduction in the Ascomycota

Fungi may be capable of both sexual and asexual reproduction. Asexual propagation may occur in several different ways: via the production of mitospores such as conidia; through the dispersal of hyphal fragments; or through the production of resting structures, such as sclerotia.

Sexual reproduction in fungi involves the production of meiospores with novel genetic information (Taylor et al. 1999). Asexual reproduction is common in fungi (Fisher 2007), and some species are only known to proliferate in this manner. Such species were previously classified in the Deuteromycetes since they could not be placed with their closest relatives in earlier classification systems that were based on the morphology of sexual reproductive structures (Schoch et al. 2009). The Deuteromycetes are now known to be a polyphyletic group; this taxon has been abandoned and its members are gradually being placed with their true relatives in a system based on evolutionary relationships (Taylor 2011).

In contrast to the sexes understood to exist in animals, fungi exhibit a complex mating system that may be controlled by several genes at multiple loci within an individual's genome. These genes are referred to as mating type genes (MAT genes). Among the Ascomycota, there are two different mating types, which have been referred to by several different names, including α and a (especially in yeast), + and –, or MAT1-1 and MAT1-2 (Casselton 2008). While both mating type genes may occupy the same genetic region, they are referred to as idiomorphs rather than alleles to emphasize their extremely divergent nature. The mating type regions appear to code for transcription factors, which in turn regulate the transcription of genes pertaining to the synthesis of hormones, proteins and compounds directly relevant to sexual reproduction (Pöggeler 2000). The two mating type idiomorphs may be identified by their DNA-binding motifs: generally, the MAT1-1 gene contains an α-box motif, while the MAT1-2 gene contains an HMG box (Arie et al. 1999). This observation facilitates the discovery of mating type genes in new species, because the remainder of the gene is often highly divergent (Taylor et al. 1999).

There are three broad categories of sexual reproduction employed by fungi: heterothallism, homothallism, and pseudohomothallism (Butler 2007). In heterothallic fungi, any one individual possesses only a single MAT idiomorph (either MAT1-1 or MAT1-2), and successful sexual reproduction requires two individuals of opposite mating types. Homothallic individuals are self-fertile, and possess both MAT1-1 and MAT1-2 within the same nucleus (although the genes themselves are not necessarily adjacent or in close proximity to each other). Pseudohomothallic individuals possess both MAT1-1 and MAT1-2, and are generally self-fertile, producing ascospores that possess both MAT genes (Merino et al. 1996). However, these genes are carried within separate nuclei. Ascospores that develop into self-fertile individuals receive both nuclei, but, infrequently, ascospores containing only a single nucleus are produced. In this manner, some self-sterile individuals are produced. These self-sterile individuals require an individual of the opposite mating type for successful sexual reproduction, as in heterothallic species (Merino et al. 1996).
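The three categories above amount to a simple decision rule based on which idiomorphs are present and whether they share a nucleus. The following Python sketch encodes that canonical rule only; the function name and the representation of an individual as a list of per-nucleus idiomorph sets are assumptions made for illustration, and real organisms may not conform to this scheme.

```python
BOTH_IDIOMORPHS = {"MAT1-1", "MAT1-2"}


def classify_mating_system(nuclei):
    """Classify an individual from the idiomorphs carried by each nuclear type.

    `nuclei` is a list of sets, one set of idiomorph names per distinct nucleus
    (a hypothetical representation chosen for this sketch).
    """
    # Both idiomorphs within a single nucleus: homothallic.
    if any(BOTH_IDIOMORPHS <= nucleus for nucleus in nuclei):
        return "homothallic"
    combined = set().union(*nuclei)
    # Both idiomorphs present, but only in separate nuclei: pseudohomothallic.
    if BOTH_IDIOMORPHS <= combined:
        return "pseudohomothallic"
    # A single idiomorph: heterothallic.
    if combined & BOTH_IDIOMORPHS:
        return "heterothallic"
    return "undetermined"


print(classify_mating_system([{"MAT1-1", "MAT1-2"}]))
print(classify_mating_system([{"MAT1-1"}, {"MAT1-2"}]))
print(classify_mating_system([{"MAT1-2"}]))
```

The rule is deliberately strict: it predicts self-fertility only when both idiomorphs are detected, so any self-fertile individual carrying a single idiomorph would fall outside the canonical categories.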

Modern genetic techniques have facilitated the identification of cryptic sexual reproduction in populations previously believed to be strictly asexual (Arie et al. 2000). One expected quality of a truly asexual population would be relatively low genetic diversity, as new genetic information would be expected to arise from mutation alone, rather than via meiotic recombination as in sexual populations (Taylor et al. 1999), although a large enough asexually reproducing population may possess a level of genetic diversity similar to that of a sexual population (e.g., Groth et al. 1995).

1.4.2 Sexual reproduction in Microdochium nivale and M. majus

The relatively large genetic variability observed in M. nivale is consistent with a sexual population, but sexual reproduction has rarely been observed in this species (Booth 1971; Cook and Bruehl 1966; Mahuku et al. 1998). Both M. nivale and M. majus produce perithecia in the lab (Lees et al. 1995; Litschko and Burpee 1987; Parry et al. 1995); however, M. majus appears to produce perithecia more readily than M. nivale, as it appears to be homothallic (self-fertile).

Old literature, which may have been referring to M. majus, describes M. nivale as homothallic; however, more recent research has suggested that M. nivale sensu stricto may actually be heterothallic, requiring individuals of opposite mating types for successful reproduction (Lees et al. 1995). If true, this could help to explain the relatively large degree of heterogeneity observed in M. nivale relative to M. majus. Furthermore, as both species actively cause disease at low temperatures, ascospores may serve as survival spores during warm, dry conditions which do not favour infection by these pathogens.

If M. majus is truly homothallic while M. nivale is heterothallic, then both mating type idiomorphs should be found in the genome of a single M. majus individual while only a single idiomorph would be expected in a M. nivale individual. The identification of mating type genes in these species has not yet been reported in the literature, but could be useful in understanding the degree of genetic diversity present in these pathogens. This information would be timely as resistance to the strobilurin pesticides has been recently observed (Walker et al. 2009); understanding the mechanism of how the genes responsible for resistance may spread through the population could help to mitigate the transmission of this potentially harmful mutation.

1.5 Sequencing techniques and bioinformatics

1.5.1 DNA sequencing techniques

Beginning with the first viral genome sequenced in 1977 (Sanger et al. 1977a), whole-genome sequencing has allowed researchers to better understand individual species and the relationships between them. Although several different sequencing methods were explored by researchers in the 1970s, the automation of the Sanger method allowed this technique to become the method of choice for most sequencing applications until the early 21st century (Shendure and Ji 2008). In Sanger sequencing, the sequence of interest is amplified in a buffered solution containing a mixture of standard deoxynucleotides (dNTPs) and chain-terminating dideoxynucleotides (ddNTPs) in the presence of a DNA polymerase enzyme and a primer that is specific to the sequence of interest.

When the method was first developed, the ddNTPs were labelled radioactively, and separate reactions were conducted for each of the four nucleotides (Sanger et al. 1977b). Within each reaction mixture, as the polymerase synthesizes a copy of the sequence of interest, the labelled ddNTP will be randomly incorporated into the sequence in place of the corresponding dNTP. As this process is repeated on the millions of template DNA molecules included in the reaction mixture, the amplification products will consist of truncated sequences that vary in length based upon when (or if) the labelled nucleotide was incorporated into the growing product. This mixture of sequences can then be separated according to size using gel electrophoresis and, in combination with the results from the other labelled nucleotides, the sequence of the original template strand may be determined. Modern methods use ddNTPs labelled with four different fluorescent dyes, which eliminates the need to run a separate reaction for each nucleotide and also facilitates automatic detection (Shendure and Ji 2008).
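The logic of reading a sequence from chain-terminated fragments can be sketched in a few lines of Python. This is an idealized illustration rather than a model of real chemistry or of any sequencing software: it assumes that every position is terminated in at least one molecule, that fragment sizes are resolved perfectly, and it uses hypothetical function names and a made-up seven-base product.

```python
def terminated_lengths(product, base):
    """Fragment lengths expected in the reaction containing the ddNTP for `base`.

    `product` is the newly synthesized strand, written 5' to 3'. A fragment of
    length i + 1 arises whenever the ddNTP is incorporated at position i.
    """
    return [i + 1 for i, b in enumerate(product) if b == base]


def read_from_lanes(lanes):
    """Reconstruct the product by ordering fragments from all four lanes by size."""
    base_at_length = {}
    for base, lengths in lanes.items():
        for length in lengths:
            base_at_length[length] = base
    return "".join(base_at_length[length] for length in sorted(base_at_length))


product = "GATTACA"  # hypothetical example sequence
lanes = {base: terminated_lengths(product, base) for base in "ACGT"}
print(lanes)
print(read_from_lanes(lanes))
```

Each lane on its own reveals only the positions of one nucleotide; the full sequence emerges when the fragments from all four reactions are ordered by size, which is the step that a single four-dye reaction performs automatically.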

The Sanger method has been used to generate full genome sequences for an assortment of diverse species, including humans. Although Sanger sequencing remains in wide use, a number of alternative sequencing strategies have become available in the past 10 years, and these techniques offer several improvements over the Sanger sequencing method in terms of both the cost and the length of time required to produce data (see section 3.1) (Lister et al. 2009). These newer technologies are broadly referred to as "next-generation" sequencing (NGS) (Shendure and Ji 2008). These techniques are more prone to certain types of sequencing error than Sanger sequencing, and the sequence fragments produced are generally shorter than those produced by the Sanger method, but they can be produced much less expensively and allow for much greater coverage (Shendure and Ji 2008). Advances in data processing methods (see section 1.5.3) have allowed sequence reads from these newer techniques to be used in de novo genome assembly to produce complete genome sequences for eukaryotes, including fungi (Nowrousian et al. 2010), plants (Shulaev et al. 2011), and mammals (Li et al. 2009).

Filamentous ascomycetes are particularly well-suited for whole-genome sequencing because of their monokaryotic haploid vegetative growth (Webster and Weber 2007), which facilitates the collection of a large amount of genetically homogenous material for sequencing. In addition, fungal genomes are typically less than 100 Mb in size (Haridas et al. 2011), which decreases both the complexity of the assembly process and the cost and resources required to obtain sufficient information for successful assembly.


1.5.2 RNA sequencing

While DNA sequences facilitate the analysis of relationships between different species or individuals, the identity and the abundance of RNA transcripts describe the priorities of an individual organism under a specific set of conditions (Wang et al. 2009). Within the context of a plant-pathogen interaction, both the host plant and the pathogen undergo changes in protein synthesis when the host is challenged by a pathogen. For example, the host may synthesize defensive or reparative enzymes while the pathogen may secrete cellulases and other enzymes responsible for degrading the host plant (Conrath et al. 2002; Dickinson 2003; Jones and Dangl 2006). The transcriptome of an organism is defined as the totality of its RNA pool under a specific set of conditions, including both the identities and the quantities of all transcripts (Wang et al. 2009). By comparing the transcriptome of healthy host plant cells to that of diseased cells, it may be possible to uncover genes which play a role in the plant’s defensive response. In addition to identifying differences in the identities of the mRNA transcripts present, it is also possible to identify differences in the relative quantities of each transcript (Li et al. 2010). If an organism is examined at different times, or in response to different stimuli, it may be possible to identify genes that have been up- or down-regulated in response to these changes.
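A comparison of relative transcript quantities between two conditions can be sketched as follows. This is a minimal illustration, assuming simple counts-per-million scaling and a log2 ratio with a pseudocount; it is not the method of any cited study, it omits replication and statistical testing, and the gene names and read counts are invented.

```python
import math


def counts_per_million(counts):
    """Scale raw read counts so that each library sums to one million."""
    total = sum(counts.values())
    return {gene: 1e6 * n / total for gene, n in counts.items()}


def log2_fold_change(control, treated, pseudocount=1.0):
    """Return log2(treated / control) per gene after library-size scaling.

    The pseudocount avoids division by zero for transcripts absent from one
    library; its value is an arbitrary choice made for this sketch.
    """
    cpm_control = counts_per_million(control)
    cpm_treated = counts_per_million(treated)
    genes = sorted(set(cpm_control) | set(cpm_treated))
    return {
        gene: math.log2(
            (cpm_treated.get(gene, 0.0) + pseudocount)
            / (cpm_control.get(gene, 0.0) + pseudocount)
        )
        for gene in genes
    }


# Hypothetical read counts for healthy and inoculated leaf tissue.
healthy = {"geneA": 100, "geneB": 100}
inoculated = {"geneA": 300, "geneB": 100}
changes = log2_fold_change(healthy, inoculated)
print(changes)
```

Positive values indicate transcripts that are relatively more abundant in the second condition and negative values the reverse. Because the scaling is relative to library size, an unchanged raw count (geneB here) can still appear down-regulated when another transcript increases sharply.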

Transcriptome analysis generates sequence information for all of the mRNA transcripts present in the organism at the time of sampling (Grabherr et al. 2011), within the limits of detection of the technique used. This powerful technique has been made possible through the recent NGS developments discussed above, and offers many advantages over previously-developed transcriptome analysis techniques such as northern blotting, microarrays, and quantitative reverse transcriptase PCR. These older techniques required prior knowledge of the sequences of interest, or of specific probes or primers, or were limited by their range of detection (Wang et al. 2009). These restrictions made it difficult to detect transcripts which may be infrequent or unique, whereas NGS techniques excel at very high levels of coverage which might reveal such transcripts, especially through the use of in silico subtraction techniques.

Transcriptome sequencing using NGS techniques, known as RNA-Seq (Nagalakshmi et al. 2008), necessitates the extraction of the RNA pool of an organism. The mRNA is isolated by affinity chromatography targeting the poly-A tail that is added in post-transcriptional modification (Nelson and Cox 2004). The mRNA is then used as a template to synthesize cDNA using reverse transcriptase, and NGS techniques are used to sequence the entire cDNA pool.

The power of RNA-Seq has been demonstrated by generating sequence libraries for yeast under both vegetative (Nagalakshmi et al. 2008) and meiotic conditions (Wilhelm et al. 2008).

Comparisons between the transcriptomes of wild-type and perturbed organisms have been made by many researchers using different experimental conditions. For example, the transcriptomes of heat-tolerant and heat-susceptible Triticum aestivum cultivars challenged with heat stress were investigated, and significant differences in transcript levels were found between the cultivar types (Qin et al. 2008). This validation of transcriptome analysis suggests that further studies could reveal important molecular processes involved in disease and stress tolerance. Importantly, both Microdochium majus and M. nivale are known wheat pathogens, but when grown competitively, M. majus out-competes M. nivale (Simpson et al. 2000); this difference is likely linked to differences within the pathogens themselves as well as in the response mounted by the host. Transcriptome analysis of wheat that has been inoculated with different pathogens may reveal differences in host response that allow M. majus to colonize more successfully. Additionally, by identifying genes and processes important in the wheat’s defensive response, it may be possible to develop effective breeding programs to select for cultivars that overexpress the genes important for resistance.

1.6 Hypotheses and Objectives

Hypotheses

1. An examination of multiple genes will support the reclassification of M. nivale and M. majus as distinct species. Furthermore, an examination of the genomes of these organisms will reveal further differences between these species. Populations of M. nivale and M. majus in North America are genetically distinct from those found in Europe, and M. nivale isolates found on turfgrass are genetically distinct from those found on wheat. Repeated sampling of M. nivale populations over several years will reveal year-to-year genetic variability at a level that is consistent with a sexually reproducing population.

2. Mating-type genes for M. majus and M. nivale will be revealed by genome sequencing, and the alternate idiomorph can be uncovered by sequencing the genes surrounding the mating-type region in other isolates.

3. Differences in host preferences reported for M. nivale and M. majus will be reflected in differing infection processes or timing when grown on Kentucky bluegrass (Poa pratensis) or on wheat (Triticum aestivum).

Objectives

1a. Obtain isolates of M. nivale and M. majus from around the world and from different hosts (both wheat and turfgrasses). Amplify and sequence a variety of protein-coding and non-protein-coding genes. Analyze the sequence differences between M. majus and M. nivale, and compare these differences to those observed for an outgroup species. Use this information to assess whether M. nivale and M. majus possess sufficient heterogeneity to justify their elevation to species, and to assess the presence or absence of sequence divergence among different host and geographic populations. (Chapter 2)

1b. Obtain complete genome sequences for one isolate of M. nivale from wheat, one isolate of M. nivale from turfgrass, and one isolate of M. majus from wheat using NGS, and identify divergent regions between the genomes of M. nivale and M. majus to further assess the elevation to full species and to search for genes that may be unique to these two species, individually and collectively. (Chapter 3)

1c. Assess genotype performance by examining genetic diversity within and among populations sampled over multiple years using ISSR markers. (Chapter 4)

2. Identify the mating-type genes within the M. nivale and M. majus genomes based on sequences obtained from other filamentous ascomycetes. Sequence and compare mating types from individual isolates of M. nivale and M. majus, and screen the in-lab isolate collections of these species to classify isolates according to mating type. Attempt in vitro crosses between individuals of opposite mating types (if available) and study the mating process microscopically if mating can be induced. (Chapters 3 and 5)

3. Inoculate detached leaves of Kentucky bluegrass and wheat with hyphal suspensions of M. nivale and M. majus to study the infection process. Collect leaf samples and examine them microscopically to determine the timing and mechanism of infection for these pathogens on two different host plants. (Chapter 6)

1.7 References for Chapter 1

Agrios, G.N. 2005. Plant Pathology. Elsevier Academic Press, Burlington, MA.
Anonymous. 1998. Golf Course Pesticide Use and Monitoring. Alberta Environmental Protection, Chemicals Assessment and Management Division, Pesticide Management Branch, Edmonton.
Anonymous. 2008. An Act to amend the Pesticides Act to prohibit the use and sale of pesticides that may be used for cosmetic purposes. Edited by Province of Ontario.
Anonymous. 2010. Re-evaluation Decision RVD2010-06, Quintozene. In 100327. Pest Management Regulatory Agency, Ottawa, Canada.
Arie, T., Kaneko, I., Yoshida, T., Noguchi, M., Nomura, Y., and Yamaguchi, I. 2000. Mating-Type Genes from Asexual Phytopathogenic Ascomycetes Fusarium oxysporum and Alternaria alternata. Molecular Plant-Microbe Interactions 13: 1330-1339.
Arie, T., Yoshida, T., Shimizu, T., Kawabe, M., Yoneyama, K., and Yamaguchi, I. 1999. Assessment of Gibberella fujikuroi mating type by PCR. Mycoscience 40: 311-314.
Berlese, A.N., and Voglino, P. 1886. in Rabenhorst, L. (1855). Klotzschii Herbarium vivum mycologicum, Ed. 1, no. 1439. In Sylloge Fungorum. Additamenta ad Volumina I-IV. pp. 1-484.
Booth, C. 1971. The Genus Fusarium. Commonwealth Mycological Institute, Kew, UK.
Butler, G. 2007. The Evolution of MAT: The Ascomycetes. In Sex in Fungi: Molecular Determination and Evolutionary Implications. Edited by J. Heitman, Kronstad, J.W., Taylor, J.W., and Casselton, L.A. ASM Press, Washington, D.C.
Casselton, L.A. 2008. Fungal sex genes - searching for the ancestors. BioEssays 30(8): 711-714.
Conrath, U., Pieterse, C.M.J., and Mauch-Mani, B. 2002. Priming in plant-pathogen interactions. Trends in Plant Science 7: 210-216.
Cook, R.J., and Bruehl, G.W. 1966. Calonectria nivalis, perfect stage of Fusarium nivale, occurs in the field in North America. Phytopathological Notes 54: 1100-1101.
Cristani, C. 1992. Seed-borne Microdochium nivale (Ces. ex Sacc.) Samuels (= Fusarium nivale (Fr.) Ces.) in naturally infected seeds of wheat and Triticale in Italy. Seed Science and Technology 20(3): 603-617.
Diamond, H., and Cooke, B.M. 1997. Host specialisation in Microdochium nivale on cereals. Cereal Research Communications 25(3): 533-538.
Dickinson, M. 2003. Resistance mechanisms in plants. In Molecular Plant Pathology. BIOS Scientific Publishers, London. pp. 145-158.
Fisher, M.C. 2007. The evolutionary implications of an asexual lifestyle manifested by Penicillium marneffei. In Sex in Fungi: Molecular Determination and Evolutionary Implications. Edited by J. Heitman, Kronstad, J.W., Taylor, J.W., and Casselton, L.A. ASM Press, Washington, D.C.
Fries, E.M. 1825. Systema Orbis Vegetabilis 1: i-viii. Typographica Academica, Lund, Sweden.
Gams, W., and Müller, E. 1980. Conidiogenesis of Fusarium nivale and Rhynchosporium oryzae and its taxonomic implications. Netherlands Journal of Plant Pathology 86: 45-53.
Gerlach, W., and Nirenberg, H. 1982. The genus Fusarium, a Pictorial Atlas. Mitteilungen aus der Biologischen Bundesanstalt für Land- und Forstwirtschaft, Berlin-Dahlem 209: 406 pp.


Glynn, N.C., Hare, M.C., and Edwards, S.G. 2008. Fungicide seed treatment efficacy against Microdochium nivale and M. majus in vitro and in vivo. Pest Management Science 64(8): 793-799.
Glynn, N.C., Hare, M.C., Parry, D.W., and Edwards, S.G. 2005. Phylogenetic analysis of EF-1 alpha gene sequences from isolates of Microdochium nivale leads to elevation of varieties majus and nivale to species status. Mycological Research 109: 872-880.
Grabherr, M.G., Haas, B.J., Yassour, M., Levin, J.Z., Thompson, D.A., Amit, I., Adiconis, X., Fan, L., Raychowdhury, R., Zeng, Q., Chen, Z., Mauceli, E., Hacohen, N., Gnirke, A., Rhind, N., di Palma, F., Birren, B.W., Nusbaum, C., Lindblad-Toh, K., Friedman, N., and Regev, A. 2011. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nature Biotechnology 29: 644-652.
Groth, J.V., McCain, J.W., and Roelfs, A.P. 1995. Virulence and isozyme diversity of sexual versus asexual collections of Uromyces appendiculatus (bean rust fungus). Heredity 75: 234-242.
Gullino, M.L., and Kuijpers, L.A.M. 1994. Social and political implications of managing plant diseases with restricted fungicides in Europe. Annual Review of Phytopathology 32: 559-579.
Haridas, S., Breuil, C., Bohlmann, J., and Hsiang, T. 2011. A biologist's guide to de novo genome assembly using next-generation sequence data: A test with fungal genomes. Journal of Microbiological Methods 86: 368-375.
Hofgaard, I.S., Wanner, L.A., Hageskal, G., Henriksen, B., Klemsdal, S.S., and Tronsmo, A.M. 2006. Isolates of Microdochium nivale and M. majus differentiated by pathogenicity on perennial ryegrass (Lolium perenne L.) and in vitro growth at low temperature. Journal of Phytopathology 154(5): 267-274.
Hoshino, T., Xiao, N., and Tkachenko, O.B. 2009. Cold adaptation in the phytopathogenic fungi causing snow molds. Mycoscience 50(1): 26-38.
Hsiang, T. 2009. All you ever wanted to know about Fusarium patch / Microdochium patch / pink snow mold or whatever that disease is called. Green Master 44(4): 13-16.
Hsiang, T., Matsumoto, N., and Millett, S.M. 1999. Biology and management of Typhula snow molds of turfgrass. Plant Disease 83(9): 788-798.
Humphreys, J., Cooke, B.M., and Storey, T. 1995. Effects of seed-borne Microdochium nivale on establishment and grain yield of winter-sown wheat. Plant Varieties and Seeds 8(2): 107-117.
Humphreys, J., Cooke, B.M., and Storey, T. 1998. Effects of seed-borne Microdochium nivale on establishment and population density at harvest of winter-sown oats. Plant Varieties and Seeds 11(2): 83-90.
Ioos, R., Belhadj, A., and Menez, M. 2004. Occurrence and distribution of Microdochium nivale and Fusarium species isolated from barley, durum and soft wheat grains in France from 2000 to 2002. Mycopathologia 158(3): 351-362.
Istokovics, A., Morita, N., Izumi, K., Hoshino, T., Yumoto, I., Sawada, M.T., Ishizaki, K., and Okuyama, H. 1998. Neutral lipids, phospholipids, and a betaine lipid of the snow mold fungus Microdochium nivale. Canadian Journal of Microbiology 44(11): 1051-1059.
Jones, J., and Dangl, J. 2006. The plant immune system. Nature 444: 323-329.
Jung, G., Chang, S.W., and Jo, Y.-K. 2007. A fresh look at fungicides for snow mold control. Golf Course Management 75(7): 91-94.

Kammoun, L.G., Gargouri, S., Hajlaoui, M.R., and Marrakchi, M. 2009. Occurrence and distribution of Microdochium and Fusarium species isolated from durum wheat in northern Tunisia and detection of mycotoxins in naturally infested grain. Journal of Phytopathology 157(9): 546-551.
Lees, A.K., Nicholson, P., Rezanoor, H.N., and Parry, D.W. 1995. Analysis of variation within Microdochium nivale from wheat - evidence for a distinct subgroup. Mycological Research 99: 103-109.
Li, B., Ruotti, V., Stewart, R.M., Thomson, J.A., and Dewey, C.N. 2010. RNA-Seq gene expression estimation with read mapping uncertainty. Bioinformatics 26(4): 493-500.
Li, R., Fan, W., Tian, G., Zhu, H., He, L., Cai, J., Huang, Q., Cai, Q., Li, B., Bai, Y., Zhang, Z., Zhang, Y., Wang, W., Li, J., Wei, F., Li, H., Jian, M., Li, J., Zhang, Z., Nielsen, R., Li, D., Gu, W., Yang, Z., Xuan, Z., Ryder, O.A., Leung, F.C.-C., Zhou, Y., Cao, J., Sun, X., Fu, Y., Fang, X., Guo, X., Wang, B., Hou, R., Shen, F., Mu, B., Ni, P., Lin, R., Qian, W., Wang, G., Yu, C., Nie, W., Wang, J., Wu, Z., Liang, H., Min, J., Wu, Q., Cheng, S., Ruan, J., Wang, M., Shi, Z., Wen, M., Liu, B., Ren, X., Zheng, H., Dong, D., Cook, K., Shan, G., Zhang, H., Kosiol, C., Xie, X., Lu, Z., Zheng, H., Li, Y., Steiner, C.C., Lam, T.T.-Y., Lin, S., Zhang, Q., Li, G., Tian, J., Gong, T., Liu, H., Zhang, D., Fang, L., Ye, C., Zhang, J., Hu, W., Xu, A., Ren, Y., Zhang, G., Bruford, M.W., Li, Q., Ma, L., Guo, Y., An, N., Hu, Y., Zheng, Y., Shi, Y., Li, Z., Liu, Q., Chen, Y., Zhao, J., Qu, N., Zhao, S., Tian, F., Wang, X., Wang, H., Xu, L., Liu, X., Vinar, T., Wang, Y., Lam, T.-W., Yiu, S.-M., Liu, S., Zhang, H., Li, D., Huang, Y., Wang, X., Yang, G., Jiang, Z., Wang, J., Qin, N., Li, L., Li, J., Bolund, L., Kristiansen, K., Wong, G.K.-S., Olson, M., Zhang, X., Li, S., Yang, H., Wang, J., and Wang, J. 2009. The sequence and de novo assembly of the giant panda genome. Nature 463(7279): 311-317.
Lister, R., Gregory, B.D., and Ecker, J.R. 2009. Next is now: new technologies for sequencing of genomes, transcriptomes, and beyond. Current Opinion in Plant Biology 12: 107-118.
Litschko, L., and Burpee, L.L. 1987. Variation among isolates of Microdochium nivale collected from wheat and turfgrasses. Transactions of the British Mycological Society 89: 252-256.
Mahuku, G.S., Hsiang, T., and Yang, L. 1998. Genetic diversity of Microdochium nivale isolates from turfgrass. Mycological Research 102: 559-567.
Maurin, N., Rezanoor, H.N., Lamkadmi, Z., Some, A., and Nicholson, P. 1995. A comparison of biological, molecular, and enzymatic markers to investigate variability within Microdochium nivale (Fries) Samuels and Hallett. Agronomie 15(1): 39-47.
Merino, S.T., Nelson, M.A., Jacobson, D.J., and Natvig, D.O. 1996. Pseudohomothallism and evolution of the mating-type chromosome in Neurospora tetrasperma. Genetics 143: 789-799.
Mueller, v.E. 1977. Die systematische Stellung des "Schneeschimmels". Revue de mycologie 41: 129-134.
Nagalakshmi, U., Wang, Z., Waern, K., Shou, C., Raha, D., Gerstein, M., and Snyder, M. 2008. The transcriptional landscape of the yeast genome defined by RNA sequencing. Science 320: 1344-1349.
Nakajima, T., and Naito, S. 1995. Reassessment of mycotoxin productivity of Microdochium nivale in Japan. Annals of the Phytopathological Society of Japan 61(4): 357-361.
Nelson, D.L., and Cox, M.M. 2004. Lehninger Principles of Biochemistry, 4th edition. W.H. Freeman, New York.

Nelson, E.B. 1992. The Biological Control of Turfgrass Diseases, New York.
Nicholson, P., Lees, A.K., Maurin, N., Parry, D.W., and Rezanoor, H.N. 1996. Development of a PCR assay to identify and quantify Microdochium nivale var nivale and Microdochium nivale var majus in wheat. Physiological and Molecular Plant Pathology 48(4): 257-271.
Nowrousian, M., Stajich, J.E., Chu, M., Engh, I., Espagne, E., Halliday, K., Kamerewerd, J., Kempken, F., Knab, B., Kuo, H.-C., Osiewacz, H.D., Poeggeler, S., Read, N.D., Seiler, S., Smith, K.M., Zickler, D., Kück, U., and Freitag, M. 2010. De novo Assembly of a 40 Mb Eukaryotic Genome from Short Sequence Reads: Sordaria macrospora, a Model Organism for Fungal Morphogenesis. PLoS Genetics 6(4): e1000891.
Parry, D.W., Rezanoor, H.N., Pettitt, T.R., Hare, M.C., and Nicholson, P. 1995. Analysis of Microdochium nivale isolates from wheat in the UK during 1993. Annals of Applied Biology 126(3): 449-455.
Pirgozliev, S.R., Edwards, S.G., Hare, M.C., and Jenkinson, P. 2003. Strategies for the control of Fusarium head blight in cereals. European Journal of Plant Pathology 109(7): 731-742.
Pöggeler, S. 2000. Two pheromone precursor genes are transcriptionally expressed in the homothallic ascomycete Sordaria macrospora. Current Genetics 37: 403-411.
Pronczuk, M., and Messyasz, M. 1991. Infection ability of mycelium and spores of Microdochium nivale (Fr.) Samuels & Hallett to Lolium perenne L. Mycotoxin Research 7A.
Qin, D., Wu, H., Peng, H., Yao, Y., Ni, Z., Li, Z., Zhou, C., and Sun, Q. 2008. Heat stress-responsive transcriptome analysis in heat susceptible and tolerant wheat (Triticum aestivum L.) by using Wheat Genome Array. BMC Genomics 9: 432-440.
Ragsdale, N.N., and Sisler, H.D. 1994. Social and political implications of managing plant diseases with decreased availability of fungicides in the United States. Annual Review of Phytopathology 32: 545-557.
Samuels, G.J., and Hallett, I.C. 1983. Microdochium stoveri and Monographella stoveri, new combinations for Fusarium stoveri and Micronectriella stoveri. Transactions of the British Mycological Society 81: 473-483.
Sanger, F., Air, G.M., Barrell, B.G., Brown, N.L., Coulson, A.R., Fiddes, J.C., Hutchison, C.A., III, Slocombe, P.M., and Smith, M. 1977a. Nucleotide sequence of bacteriophage phi X174 DNA. Nature 265: 687-695.
Sanger, F., Nicklen, S., and Coulson, A.R. 1977b. DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences of the U.S.A. 74(12): 5463-5467.
Schoch, C.L., Sung, G.H., Lopez-Giraldez, F., Townsend, J.P., Miadlikowska, J., Hofstetter, V., Robbertse, B., Matheny, P.B., Kauff, F., Wang, Z., Gueidan, C., Andrie, R.M., Trippe, K., Ciuffetti, L.M., Wynns, A., Fraker, E., Hodkinson, B.P., Bonito, G., Groenewald, J.Z., Arzanlou, M., de Hoog, G.S., Crous, P.W., Hewitt, D., Pfister, D.H., Peterson, K., Gryzenhout, M., Wingfield, M.J., Aptroot, A., Suh, S.O., Blackwell, M., Hillis, D.M., Griffith, G.W., Castlebury, L.A., Rossman, A.Y., Lumbsch, H.T., Lucking, R., Budel, B., Rauhut, A., Diederich, P., Ertz, D., Geiser, D.M., Hosaka, K., Inderbitzin, P., Kohlmeyer, J., Volkmann-Kohlmeyer, B., Mostert, L., O'Donnell, K., Sipman, H., Rogers, J.D., Shoemaker, R.A., Sugiyama, J., Summerbell, R.C., Untereiner, W., Johnston, P.R., Stenroos, S., Zuccaro, A., Dyer, P.S., Crittenden, P.D., Cole, M.S., Hansen, K., Trappe, J.M., Yahr, R., Lutzoni, F., and Spatafora, J.W. 2009. The Ascomycota Tree of Life: A Phylum-wide Phylogeny Clarifies the Origin and Evolution of Fundamental Reproductive and Ecological Traits. Systematic Biology 58(2): 224-239.
Shendure, J., and Ji, H. 2008. Next-generation DNA sequencing. Nature Biotechnology 26(10): 1135-1145.
Shoemaker, R.A. 1981. Changes in taxonomy and nomenclature of important genera of plant pathogens. Annual Review of Phytopathology 19: 297-307.
Shulaev, V., Sargent, D.J., Crowhurst, R.N., Mockler, T.C., Folkerts, O., Delcher, A.L., Jaiswal, P., Mockaitis, K., Liston, A., Mane, S.P., Burns, P., Davis, T.M., Slovin, J.P., Bassil, N., Hellens, R.P., Evans, C., Harkins, T., Kodira, C., Desany, B., Crasta, O.R., Jensen, R.V., Allan, A.C., Michael, T.P., Setubal, J.C., Celton, J.-M., Rees, D.J.G., Williams, K.P., Holt, S.H., Rojas, J.J.R., Chatterjee, M., Liu, B., Silva, H., Meisel, L., Adato, A., Filichkin, S.A., Troggio, M., Viola, R., Ashman, T.-L., Wang, H., Dharmawardhana, P., Elser, J., Raja, R., Priest, H.D., Bryant, D.W., Fox, S.E., Givan, S.A., Wilhelm, L.J., Naithani, S., Christoffels, A., Salama, D.Y., Carter, J., Girona, E.L., Zdepski, A., Wang, W., Kerstetter, R.A., Schwab, W., Korban, S.S., Davik, J., Monfort, A., Denoyes-Rothan, B., Arus, P., Mittler, R., Flinn, B., Aharoni, A., Bennetzen, J.L., Salzberg, S.L., Dickerman, A.W., Velasco, R., Borodovsky, M., Veilleux, R.E., and Folta, K.M. 2011. The genome of woodland strawberry (Fragaria vesca). Nature Genetics 43(2): 109-116.
Simpson, D.R., Rezanoor, H.N., Parry, D.W., and Nicholson, P. 2000. Evidence for differential host preference in Microdochium nivale var. majus and Microdochium nivale var. nivale. Plant Pathology 49(2): 261-268.
Smiley, R.W., Dernoeden, P.H., and Clarke, B.B. 2005. Compendium of Turfgrass Diseases, 3rd Ed. American Phytopathological Society, St. Paul, MN.
Smith, J.D. 1983. Fusarium nivale (Gerlachia nivalis) from cereals and grasses - is it the same fungus? Canadian Plant Disease Survey 63(1): 25-26.
Snider, C.S., Hsiang, T., Zhao, G.Y., and Griffith, M. 2000. Role of ice nucleation and antifreeze activities in pathogenesis and growth of snow molds. Phytopathology 90(4): 354-361.
Snyder, W.C., and Nash, S.M. 1968. Relative incidence of Fusarium pathogens of cereals in rotation plots at Rothamsted. Transactions of the British Mycological Society 51: 417-425.
Taylor, J.W. 2011. One fungus = one name: DNA and fungal nomenclature twenty years after PCR. IMA Fungus 2(2): 113-120.
Taylor, J.W., Jacobson, D.J., and Fisher, M.C. 1999. The evolution of asexual fungi: reproduction, speciation, and classification. Annual Review of Phytopathology 37: 197-246.
Waalwijk, C., Kastelein, P., de Vries, I., Kerenyi, Z., van der Lee, T., Hesselink, T., Kohl, J., and Kema, G. 2003. Major changes in Fusarium spp. in wheat in the Netherlands. European Journal of Plant Pathology 109(7): 743-754.
Walker, A.S., Auclair, C., Gredt, M., and Leroux, P. 2009. First occurrence of resistance to strobilurin fungicides in Microdochium nivale and Microdochium majus from French naturally infected wheat grains. Pest Management Science 65(8): 906-915.
Wang, Z., Gerstein, M., and Snyder, M. 2009. RNA-Seq: a revolutionary tool for transcriptomics. Nature Reviews Genetics 10(1): 57-63.
Webster, J., and Weber, R.W.S. 2007. Introduction to Fungi, 3rd edition. Cambridge University Press, New York.

Wilhelm, B.T., Marguerat, S., Watt, S., Schubert, F., Wood, V., Goodhead, I., Penkett, C.J., Rogers, J., and Bähler, J. 2008. Dynamic repertoire of a eukaryotic transcriptome surveyed at single-nucleotide resolution. Nature 453: 1239-1243.
Windels, C.E. 1999. Economic and social impacts of Fusarium head blight: changing farms and rural communities in the northern Great Plains. Phytopathology 90(1): 17-21.
Wollenweber, H.W. 1930. Fusaria Autographice Delineata, 2nd ed. Edited by H.W. Wollenweber, Berlin.

Figure 1.1 Pink snow mould damage on mixed creeping bentgrass / annual bluegrass green (Agrostis stolonifera / Poa annua), Guelph, ON, 02 March 2012.

Chapter 2 Phylogenetic Analyses

2.1 Introduction

2.1.1 Fungal taxonomy and nomenclature

Taxonomy is the process of creating hierarchical groups of living things, whereas nomenclature is the system that governs the naming of the organisms and groups delineated by taxonomy (Lizon and Samuels 1997). Historically, fungi were identified and classified according to their morphology, especially that of the sexual spores and structures (Webster and Weber 2007). The first set of rules establishing the nomenclature of fungi was published in the late 19th century (de Candolle 1867) in what was to become the International Code of Botanical Nomenclature (ICBN). Under the versions of the ICBN in force from 1905 until 2011, it was permitted to apply separate names to the sexual and asexual stages of the same organism (the teleomorph and anamorph names, respectively), and these different generic names were separately classified at higher ranks as well. Recent proposals for a more sensible one fungus - one name system (Hawksworth 2004; Taylor 2011) were adopted in the International Code of Nomenclature for algae, fungi, and plants (Hawksworth 2004; McNeill et al. 2012), ending the dual nomenclature system that had been employed for fungi for over 100 years.

This reliance on knowledge of sexual structures for classification meant that species which were not known to reproduce sexually could not be classified with their closest relatives. The phylum Deuteromycota was erected to house these asexual species (Webster and Weber 2007). However, the currently accepted fungal phylogeny does not include the Deuteromycota because these fungi are asexual forms of Ascomycota or Basidiomycota (Hibbett et al. 2007). Although morphological observations are still valuable for the identification of fungi, the addition of molecular methods (see section 2.1.2) has revolutionized fungal taxonomy (Schoch et al. 2009).

2.1.2 Molecular phylogeny

The use of DNA, rather than morphology, for taxonomic distinctions among the fungi was pioneered in the early 1990s (Hibbett et al. 2007) by White and colleagues (1990). The addition of molecular characters to the morphological and ecological observations that had previously been used to delineate fungi into taxa revealed the presence of cryptic species in several economically and environmentally important groups, including human pathogens (Koufopanou et al. 2001; Pringle et al. 2005), agricultural pests (Wang et al. 1998), and ecologically sensitive species (Bickford et al. 2007).

In recent years, the amount of genetic information available in public databases such as GenBank (run by the National Center for Biotechnology Information; www.ncbi.nlm.nih.gov/) has increased rapidly due to technical improvements and the concurrent decrease in sequencing cost. While this enormous pool of data has been beneficial for researchers in many disciplines, the most effective ways to manage, analyze, and interpret this information have been contentious. For example, when sequence information for multiple genes is available for the taxa of interest, some researchers have questioned whether it is more effective to compile all available data into a single analysis, or whether each genetic region should instead be considered separately. If these regions are analysed separately, conclusions about the dataset as a whole may be derived from the results of each individual analysis (Gontcharov et al. 2004). The goal of such experiments, often using cladistic methods, is to discover the “true” tree which is in agreement with the evolutionary history of the organisms under study (Schuh and Brower 2009). However, except in very rare cases, the true evolutionary history is unknown. It is therefore important to carefully consider the assumptions implicit in any analysis in order to ensure that the results are meaningful.

2.1.3 Genes used for molecular phylogeny in fungi

The rDNA region has been of interest for phylogenetic study since the 1980s (Woese and Olsen 1986). This region is highly appealing for taxonomic study for several reasons. First, it is of high copy number in the genome, which reduces the total amount of DNA necessary for successful amplification (Nilsson et al. 2008). Second, the rDNA region consists of highly conserved regions (18S and 28S) interspersed with less-conserved regions (the internal transcribed spacer [ITS] and intergenic spacer [IGS] regions), which facilitates the design of universal primers (White et al. 1990) and provides several levels of taxonomic resolution (Nilsson et al. 2008). Furthermore, at the present time, this region is the most extensively sequenced and studied in fungi (e.g. Schoch et al. 2009), providing an enormous pool of data for researchers to use for either fungal identification or taxonomic studies (O'Brien et al. 2005). For example, the International Barcode of Life project (http://www.barcodeoflife.org/) recently selected the ITS region as a universal marker region for the rapid identification of fungi (Schoch et al. 2012). In recent re-evaluations of the taxonomies of the fungi (Hibbett et al. 2007) and the Ascomycota (Schoch et al. 2009), rDNA sequences were included as important sources of evidence for the proposed re-ordering of several major taxa.

Despite this frequent use, the rDNA region alone, especially the sometimes highly variable ITS region, may not be sufficient for low-level taxonomic distinctions. For example, the 95-97% sequence similarity proposed as the threshold for the distinction between species may fall within the range of intraspecific variation in some species (Nilsson et al. 2008; Wang et al. 2011). Furthermore, the ITS region of some taxa from different genera may be highly similar (e.g. poae and Gaeumannomyces graminis) (data not shown). Supplemental information from ecology or morphology, or from additional genetic regions, is thus valuable in inter- and intraspecific distinction.
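The similarity threshold discussed above can be made concrete with a short Python sketch that computes the percent identity between two aligned sequences. The two fragments are hypothetical and are not real ITS sequences. Excluding gapped columns from the comparison is an assumption of this sketch; other conventions for counting gaps exist and give different values.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are ignored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, invented for illustration.
fragment_1 = "ACGTTGCA-GGCTAACGTTA"
fragment_2 = "ACGTTGCATGGCTTACGTTA"

identity = percent_identity(fragment_1, fragment_2)
print(f"{identity:.1f}% identity")
print("at or above a 97% threshold:", identity >= 97.0)
```

A single mismatch in a short alignment is enough to move the pair below the threshold, which illustrates why a fixed cut-off should be interpreted alongside other evidence.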

In addition to rDNA sequences such as 18S or 28S, protein-coding genes such as β-tubulin (Keeling et al. 2000; Myllys et al. 2002), RNA polymerase II subunits 1 and 2 (Hibbett et al. 2007; Schoch et al. 2009), elongation factor-1α (Schoch et al. 2009), glyceraldehyde-3-phosphate dehydrogenase (Myllys et al. 2002), and cytochrome c oxidase 1 (Damon et al. 2010; Seifert et al. 2007) have also been used extensively in molecular phylogeny among the fungi.

2.1.4 Tree-building algorithms

The construction of phylogenetic trees provides a visual representation of a hypothesis describing the relationships between various taxa. The choice of algorithm used for the construction of a tree may affect the apparent relationships between the taxa under study. Several tree-building algorithms are currently in use, each with a different balance of strengths and weaknesses.

Neighbour joining (NJ) is an example of a distance matrix-based method, in which an algorithm is used to calculate the relative distances between the taxa under study based on differences in their character states (e.g. nucleic acid sequence) (Holder and Lewis 2003). In NJ, taxa are combined into a group if their combination decreases the overall tree length. This process is repeated until the relationships between all of the taxa have been computed (Saitou and Nei 1987). Bootstrap analysis, wherein the dataset is re-sampled many times, is often conducted to estimate the statistical support for the divisions proposed by the tree (Felsenstein 1985). Neighbour joining is generally a rapid and computationally undemanding method, but it is insensitive to homoplasy (Holder and Lewis 2003), and the resulting tree may be influenced by the order of the input sequences (Harrison and Langdale 2006; Schuh and Brower 2009).
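The clustering procedure described above can be sketched in a few lines of Python. This is a minimal illustration that follows the standard formulation of neighbour joining, not the implementation of any particular software package. The function name, the return format, and the five-taxon distance matrix are assumptions made for this example.

```python
def neighbour_joining(names, dist):
    """Cluster taxa by neighbour joining.

    names: taxon labels; dist: symmetric distance matrix (list of lists).
    Returns (joins, last_edge). Each join is (label_i, label_j, branch_i,
    branch_j); last_edge connects the two nodes that remain at the end.
    """
    labels = list(names)
    d = [list(row) for row in dist]
    joins = []
    while len(labels) > 2:
        n = len(labels)
        row_sums = [sum(row) for row in d]
        # Choose the pair with the smallest Q value, i.e. the pair whose
        # joining gives the shortest estimated total tree length.
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                q = (n - 2) * d[i][j] - row_sums[i] - row_sums[j]
                if best is None or q < best[0]:  # ties go to the earlier pair
                    best = (q, i, j)
        _, i, j = best
        # Branch lengths from the new internal node to taxa i and j.
        branch_i = 0.5 * d[i][j] + (row_sums[i] - row_sums[j]) / (2 * (n - 2))
        branch_j = d[i][j] - branch_i
        joins.append((labels[i], labels[j], branch_i, branch_j))
        # Replace i and j with the new node and update the distance matrix.
        keep = [k for k in range(n) if k not in (i, j)]
        new_row = [0.5 * (d[i][k] + d[j][k] - d[i][j]) for k in keep]
        d = [[d[a][b] for b in keep] + [new_row[x]] for x, a in enumerate(keep)]
        d.append(new_row + [0.0])
        labels = [labels[k] for k in keep] + [f"({labels[i]},{labels[j]})"]
    return joins, (labels[0], labels[1], d[0][1])


# Hypothetical pairwise distances between five taxa.
taxa = ["a", "b", "c", "d", "e"]
distances = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

joins, last_edge = neighbour_joining(taxa, distances)
for join in joins:
    print(join)
print(last_edge)
```

In this sketch, ties between equally good pairs are resolved in favour of the pair that appears first in the input, which is one way in which the order of the input sequences can influence the resulting tree.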

In maximum likelihood (ML), a likelihood function is defined over the sites present in the aligned sequences. Evolutionary events such as transitions (a change from one purine to the other, or from one pyrimidine to the other), transversions (a change from a purine to a pyrimidine or vice versa), and indels (the insertion or deletion of a nucleotide) are assigned independent probabilities, as they may occur at different rates (Holder and Lewis 2003). The likelihood function is evaluated for candidate trees, and the tree with the largest likelihood value, which minimizes “unlikely” mutations such as indels in favour of “likely” mutations such as transitions, is the most likely to have produced the observed data and is chosen as the best tree (Li 1997). Given sufficient time and computational power, ML will find the “true” tree (Li 1997); however, ML is a computationally intensive method, and it is not feasible to evaluate all of the possible trees in any but the smallest datasets (Holder and Lewis 2003), which means that ML may fail to find the optimal tree.
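The idea of choosing the parameter value that maximizes a likelihood function can be shown on the smallest possible case: two aligned sequences separated by a single branch. The Python sketch below is a deliberate simplification. It assumes the Jukes-Cantor (JC69) model, in which all substitutions are equally probable, rather than a model with separate transition and transversion probabilities, and it estimates only a branch length instead of searching among trees. The alignment size and the number of differences are hypothetical.

```python
import math


def jc69_log_likelihood(n_sites, n_diff, d):
    """Log-likelihood of a pairwise alignment under the JC69 model.

    d is the expected number of substitutions per site between the sequences.
    """
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * decay  # probability that a site is unchanged
    p_diff = 0.25 - 0.25 * decay  # probability of one specific different base
    return (
        (n_sites - n_diff) * math.log(0.25 * p_same)
        + n_diff * math.log(0.25 * p_diff)
    )


def ml_distance(n_sites, n_diff, step=0.0001, max_d=2.0):
    """Grid search for the distance with the highest log-likelihood."""
    grid = [step * k for k in range(1, int(max_d / step) + 1)]
    return max(grid, key=lambda d: jc69_log_likelihood(n_sites, n_diff, d))


# Hypothetical alignment: 500 sites, 40 of which differ.
n_sites, n_diff = 500, 40
d_hat = ml_distance(n_sites, n_diff)

# Under JC69 the maximum also has a closed form, which serves as a check.
closed_form = -0.75 * math.log(1.0 - 4.0 * (n_diff / n_sites) / 3.0)
print(f"grid search: {d_hat:.4f}, closed form: {closed_form:.4f}")
```

For a real tree, the same principle applies to every branch length and to the topology itself, which is why the number of evaluations quickly becomes prohibitive and heuristic searches are used.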

In maximum parsimony (MP), the optimum tree is the one that requires the smallest number of evolutionary events to explain sequence differences at parsimony-informative sites (Holder and Lewis 2003). When the sequences of interest have been aligned, parsimony-informative sites are those which possess at least two different nucleotides (or amino acids), each of which is present in at least two, but not all, of the sequences under study (Fitch 1977). The minimum number of changes (transitions, transversions, or indels) required to explain the nucleotide changes across all of the sequences is recorded for that site. This process is repeated at each phylogenetically informative site in the sequence data to construct a tree (Li 1997). Maximum parsimony is intermediate in computational demand between ML and NJ, but the results that it produces may be biased by long-branch attraction, wherein chance similarities between divergent sequences may result in them being incorrectly placed close together on the MP tree (Holder and Lewis 2003). As in NJ and ML, bootstrapping should be used to estimate the statistical strength of each branch within an MP tree (Holder and Lewis 2003).
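The definition of a parsimony-informative site, and the counting of changes on a candidate tree, can be illustrated with the following Python sketch. The four sequences and taxon names are hypothetical. Changes are counted with the Fitch small-parsimony procedure, with all substitutions weighted equally and gaps not treated specially; both are assumptions of this example, and the trees are written as nested pairs for convenience.

```python
from collections import Counter


def parsimony_informative_sites(sequences):
    """Indices of columns with at least two states that each occur at least twice."""
    informative = []
    for col in range(len(sequences[0])):
        counts = Counter(seq[col] for seq in sequences)
        if sum(1 for n in counts.values() if n >= 2) >= 2:
            informative.append(col)
    return informative


def fitch_changes(tree, states):
    """Minimum number of changes for one site on a rooted binary tree.

    tree: nested 2-tuples of taxon names; states: dict of taxon -> character.
    """
    def visit(node):
        if isinstance(node, str):
            return {states[node]}, 0
        left_set, left_cost = visit(node[0])
        right_set, right_cost = visit(node[1])
        shared = left_set & right_set
        if shared:
            return shared, left_cost + right_cost
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


def tree_length(tree, alignment):
    """Total number of changes required by a tree, summed over all sites."""
    taxa = list(alignment)
    n_sites = len(alignment[taxa[0]])
    return sum(
        fitch_changes(tree, {t: alignment[t][site] for t in taxa})
        for site in range(n_sites)
    )


# Hypothetical aligned sequences for four taxa.
alignment = {"t1": "ACGTA", "t2": "ACGTT", "t3": "AGCTA", "t4": "AGCAT"}

print(parsimony_informative_sites(list(alignment.values())))
for tree in [
    (("t1", "t2"), ("t3", "t4")),
    (("t1", "t3"), ("t2", "t4")),
    (("t1", "t4"), ("t2", "t3")),
]:
    print(tree, tree_length(tree, alignment))
```

The invariant first column and the column in which only one taxon differs contribute the same number of changes to every tree, so they cannot discriminate among topologies. Only the informative columns determine which tree is shortest.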

Despite their individual weaknesses, all three of the methods described are still in regular use for estimating the relationships between taxa (Hibbett et al. 2007; Schoch et al. 2009). A search of the Google Scholar database using the term phylog* and either "maximum likelihood", "maximum parsimony" or "'neighbour joining' OR 'neighbor joining'" for articles published between 2000 and 2012 revealed that neighbour joining was the most commonly referenced method until 2009, when it was overtaken by maximum likelihood (Figure 2.1).

2.1.5 Genetic differences between and within M. nivale and M. majus

As there are few characteristic morphological differences between M. nivale and M. majus (Gerlach and Nirenberg 1982), a number of molecular markers have been described to study variability within and to differentiate between them. Maurin et al. (1995) studied a diverse range of isolates from across Europe and from a number of plant hosts using a range of physical and biochemical characteristics including esterase enzyme profiles and RFLP profiles of the ITS rDNA. Digestion of the amplified ITS region with the restriction enzyme RsaI made a single cut in some of the isolates, including a reference sample of var. majus, but failed to cut other isolates, including the reference sample for var. nivale. The esterase profiles visualized on acrylamide gels provided another distinguishing feature: while esterase profiles of M. majus were generally similar, those of M. nivale displayed more heterogeneity. Primers specific to M. nivale and to M. majus were designed by Nicholson et al. (1996) based on primers used for RAPD PCR. These primers have failed to amplify North American isolates of M. nivale (Glynn et al. 2005; Jewell and Hsiang 2013), suggesting possible heterogeneity between the North American and European populations of this species.

To elucidate the genetic diversity and host specificity of M. nivale in south-eastern Ontario, numerous isolates of M. nivale were examined by Mahuku et al. (1998). Isolates were collected from Lolium perenne and P. pratensis in Guelph and from A. stolonifera in both Guelph and Cambridge (two Ontario cities 25 km apart). All isolates were identified as M. nivale using the RsaI restriction assay, and digestion of the IGS region with the restriction enzymes Hae III, Cfo I, and Mbo I identified 60 unique genotypes among the 100 isolates studied. Random amplification of polymorphic DNA (RAPD) PCR revealed the presence of 96 unique genotypes, and dendrograms constructed from the RAPD data separated the isolates into four groups, corresponding to the population from which they had been initially collected. Interestingly, the populations from Guelph and Cambridge which shared the same host species were more similar to one another than to any of the other isolate groups from other host species, implying that modest host specialization may be present within turfgrass-colonizing M. nivale strains (Mahuku et al. 1998). The large number of unique genotypes observed suggests that sexual reproduction may occur in this species despite limited direct field evidence to support this claim. Furthermore, the large degree of intra-varietal heterogeneity observed in this study agrees with the assertion by Maurin et al. (1995) that M. nivale isolates are genetically heterogeneous.

Glynn et al. (2005) examined a portion of the EF-1α coding region that was 838 base pairs in length for 15 isolates of M. nivale sensu lato by designing specific primers to amplify either M. nivale or M. majus. Sequencing of the resulting amplicons revealed 96% sequence similarity between M. nivale and M. majus, while the mean sequence similarities within each group were 99.7% and 99.8%, respectively. In addition to developing a useful identification tool, this study prompted these researchers to suggest that the subspecies of M. nivale sensu lato should be promoted to the species level of classification (Glynn et al. 2005). However, this assertion is based upon the analysis of a single gene; in contrast, a recent revision of general ascomycete taxonomy relied on six genes (Schoch et al. 2009). Literature published on M. nivale and M. majus since Glynn et al.’s assertion has not always accepted the elevation (e.g., Kaneko and Ishii 2009; Pociecha et al. 2010). Additional studies using multiple genes to confirm this divergence would support the elevation of M. nivale and M. majus to distinct species.
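Between-group and within-group similarity values of the kind reported above can be obtained from pairwise comparisons of aligned sequences. The following is a minimal sketch under stated assumptions: the function names and toy sequences are hypothetical, positions with a gap in either sequence are skipped, and this is not the procedure used by Glynn et al. (2005), whose exact calculation is not described here.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b):
    """Percent of compared positions that are identical in two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped positions are not scored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


def mean_pairwise_identity(sequences):
    """Mean percent identity over all unordered pairs within one group."""
    pairs = list(combinations(sequences, 2))
    return sum(percent_identity(a, b) for a, b in pairs) / len(pairs)


# Hypothetical aligned fragments standing in for one group of isolates.
group = ["ACGTACGTAC", "ACGTACGTAT", "ACGTACGTAC"]
print(round(mean_pairwise_identity(group), 1))
```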

2.1.6 Objectives

The first major objective of this project was to determine whether an examination of four genetic regions would reveal differences between isolates of M. majus and M. nivale that are consistent with their elevation to species from varieties. Intra-specific variation was assessed by investigating isolates originating from both Europe and North America and, for M. nivale, isolates originating from turfgrass and from wheat on both continents. The utility of previously-published primers for distinguishing between M. nivale and M. majus was assessed.


2.2 Materials and Methods

2.2.1 Sample Collection

Isolates of M. nivale and M. majus were from the local lab collection (preserved on frozen grain at -20 °C), from the field, or from collaborators (Table 2.1). Field isolates were obtained from plants as follows: leaf blades were cut into small pieces no larger than 0.5 cm in length, rinsed in sterile distilled water (SDW), surface sterilized in a 1% NaOCl solution for 0.5–5 minutes, rinsed in fresh SDW, and placed onto 2% potato dextrose agar (PDA; Becton, Dickinson and Company, MD, USA) prepared according to the manufacturer’s instructions and amended with the antibiotics tetracycline (0.5 µg/mL) and streptomycin (1 µg/mL). The plates were stored at 10 °C in the dark for up to two weeks until fungal colonies were visible; specimens resembling M. nivale or M. majus were isolated by cutting a small plug of agar from the actively growing margin of each colony and placing it in the centre of a Petri dish containing unamended 2% PDA. Several transfers were made and, where possible, single-spore isolates were obtained before subsequent experiments and for preparing stock tubes for the permanent collection.

Fungal isolates were identified on the basis of growth rate on PDA (0.13-0.37 mm/h at 20 °C) (Litschko and Burpee 1987), colony morphology (salmon-pink mycelium), and, where available, conidial morphology (hyaline, 0-3 septa with width < 4.7 µm, length 8-27 µm for M. nivale and hyaline, 1-7 septa with width > 4.5 µm, length 15-33 µm for M. majus) (Glynn et al. 2005). All fungal isolates were maintained on 2% PDA and stored at 4 °C. For long-term storage, 15 mL vials containing thrice-autoclaved wheat seeds were inoculated with a fungal isolate, or agar plugs were placed into 15 mL vials containing SDW (5 mL). One vial each of water and wheat seed was stored at 4 °C while two additional wheat vials were stored at -20 °C.


A total of 62 isolates were included in these experiments (Table 2.1).

2.2.2 DNA extraction

Fungal DNA was obtained by growing each isolate on PDA overlaid with a 6 cm x 6 cm cellophane sheet (Flexel Inc., Atlanta, Ga., U.S.A.) presterilized by autoclaving three times in dH2O (20 minutes, 121 °C). After 5 days, mycelium was harvested by scraping it from the cellophane (avoiding the initial agar plug) and collecting it in a 1.5 mL tube. Mycelium was stored at -20 °C until DNA extraction (Edwards et al. 1991). For each DNA extraction, 100 mg of either fresh or frozen mycelium was placed in a 1.5 mL tube containing approximately 50 mg of acid-washed and autoclaved sea sand (Fisher, Fair Lawn, NJ, USA) and 200 µL Edwards buffer (Tris HCl (pH 7.5) 200 mM, NaCl 250 mM, EDTA 25 mM, SDS 0.5% (w/v)). The mycelium was ground with a plastic pestle (Froggabio, Toronto, Canada) using a Mastercraft Lithium Ion screwdriver (3.6 V, Canada) for 60 to 120 s to disrupt the cells. A further 200 µL of Edwards buffer was added and the extraction mixture was incubated at room temperature for 2-3 h, after which it was centrifuged at 12,000 x g in an Eppendorf 5415 D centrifuge (Eppendorf, Mississauga, Canada) for 10 min. The supernatant was transferred into a fresh 1.5 mL tube and an equal volume of -20 °C isopropanol was added to precipitate the DNA. The tube was incubated at -20 °C for at least 1 h. The tube was then spun in the centrifuge at 12,000 x g for 10 min to pellet the DNA. The supernatant was discarded and the pellet was washed with 200 µL of cold 70% ethanol. The ethanol was discarded and the pellet was dried by inverting the tube and incubating at room temperature for 10 min. The DNA was then resuspended in 100 µL of PCR water (nuclease-free water, Fisher Scientific, USA) by gently pipetting several times. The tubes were incubated at 4 °C for at least 1 h to allow the dissolution of the DNA before storage at -20 °C.

DNA quality and quantity were assessed by subjecting the samples to electrophoresis through a 1% agarose gel (UltraPure™ Agarose, Invitrogen, Carlsbad, CA, USA) prepared using 0.5X TBE buffer (90 mM Tris base, 90 mM boric acid, 2 mM EDTA). A 3 µL mixture (2:1) of DNA and loading buffer (0.1% w/v bromophenol blue, 0.1% w/v xylene cyanol, 30% v/v glycerol, and 60 mM EDTA) was loaded onto the gel and subjected to electrophoresis in 0.5X TBE buffer at 100 V for approximately 30 minutes before being stained with ethidium bromide and visualized under UV light. The size and approximate concentration of the amplicons were calculated by comparison to a molecular weight and mass ladder loaded in every gel (GeneRuler 100 bp DNA ladder, Fermentas, Canada).

2.2.3 Primer design and selection

To examine the genetic regions included in this study, previously published fungal primers were tested in M. nivale and M. majus (Table 2.2) using the PCR protocol suggested in the original journal article where the primers were designed. In cases where these primers failed to yield consistent results among all of the isolates examined, new primers were designed for each particular region.

PCR primers were designed according to the following general protocol. Sequences from the gene of interest were downloaded from the NCBI GenBank database. When possible, sequences were selected from Microdochium species; when no Microdochium sequences were available, sequences from other taxonomically related species were chosen. The sequences were aligned (section 2.2.5) and were visualized in BioEdit. Candidate primer sequences were selected within the conserved regions predicted by the alignment. The candidate primers were assessed using both Primer3 v. 2.2.2 (Rozen and Skaletsky 2000) and OligoCalc (Kibbe 2007). Primers that were 16-22 bp in length, possessed a GC content of 40-70%, had a predicted melting temperature of 48-64 °C, and were not predicted to possess self-complementarity or to fold into hairpin structures were selected as strong candidates. Primer pairs that were to be used in a single reaction were further screened to ensure that their predicted melting temperatures were within 2 °C of each other and that they were unlikely to form dimer structures. All primers included in these experiments were ordered from Laboratory Services Division, University of Guelph.
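The length, GC content, and melting temperature criteria listed above can be expressed as a simple screen. The sketch below is illustrative only: the primer sequences are hypothetical, and the melting temperature is estimated with the basic rule Tm = 2(A+T) + 4(G+C), which is an assumption standing in for the values that Primer3 and OligoCalc would report. Self-complementarity, hairpin, and dimer checks are not implemented.

```python
def gc_percent(primer):
    """GC content of a primer as a percentage of its length."""
    p = primer.upper()
    return 100.0 * (p.count("G") + p.count("C")) / len(p)


def basic_tm(primer):
    """Rough melting temperature in degrees C: 2 per A/T plus 4 per G/C (assumed stand-in)."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))


def passes_screen(primer):
    """Apply the length (16-22 nt), GC (40-70%), and Tm (48-64 C) thresholds."""
    return (
        16 <= len(primer) <= 22
        and 40.0 <= gc_percent(primer) <= 70.0
        and 48 <= basic_tm(primer) <= 64
    )


def compatible_pair(forward, reverse, max_tm_difference=2):
    """Both primers pass the screen and their estimated Tm values are within 2 C."""
    return (
        passes_screen(forward)
        and passes_screen(reverse)
        and abs(basic_tm(forward) - basic_tm(reverse)) <= max_tm_difference
    )


# Hypothetical candidate primers (not sequences used in this study).
forward = "ACGTTGCAAGCTAGGCTA"
reverse = "TGCATCGGATCCATGCAA"
print(gc_percent(forward), basic_tm(forward), compatible_pair(forward, reverse))
```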

2.2.4 PCR protocols and sequencing

All PCR reactions were optimized to yield single amplicons. The general PCR procedure utilized was as follows: approximately 1 ng (1 μL) of fungal DNA was added to a 0.2 mL PCR tube containing a pre-mixed solution consisting of 1X PCR buffer, either 0.2 mM (ITS, EF-1α) or 0.4 mM (β-tubulin, RPB2) Mg2+, 2.5 mM dNTP mixture (prepared using individual solutions of dATP, dTTP, dCTP, and dGTP; BioBasic, Markham, ON), 0.5 μM of each of the forward and reverse primers, 0.04 U of Tsg DNA polymerase (BioBasic, Markham, ON), and enough sterile water to bring the total volume (including DNA) to 15 μL. The reaction mixture was subjected to a thermal cycling procedure. For ITS, the thermal cycling protocol consisted of five minutes at 95 °C, followed by 30 cycles of a 30 s denaturation at 95 °C, a 45 s annealing at 55 °C, and a 60 s extension at 72 °C, followed by a final 10 minute extension at 72 °C. For RPB2, the thermal cycling protocol consisted of five minutes at 95 °C, followed by 35 cycles of a 30 s denaturation at 95 °C, a 45 s annealing at 58 °C, and a 90 s extension at 72 °C, followed by a final 10 minute extension at 72 °C. For β-tubulin, the thermal cycling protocol consisted of five minutes at 95 °C, followed by 35 cycles of a 30 s denaturation at 95 °C, a 45 s annealing at 53 °C, and a 90 s extension at 72 °C, followed by a final 10 minute extension at 72 °C. For EF-1α, the thermal cycling protocol consisted of 75 s at 95 °C, followed by 35 cycles of a 30 s denaturation at 95 °C, a 15 s annealing at 52 °C, and a 45 s extension at 72 °C, followed by a final 4.3 minute extension at 72 °C. For Y13, the thermal cycling protocol consisted of five minutes at 95 °C, followed by 35 cycles of a 30 s denaturation at 95 °C, a 60 s annealing at 45 °C, and a 45 s extension at 72 °C, followed by a final 5 minute extension at 72 °C.
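The five thermal cycling protocols described above differ only in a handful of parameters, which can be collected in one structure for comparison. This is a minimal sketch; the class and field names are hypothetical, all denaturation and extension steps are at 95 °C and 72 °C as stated in the text, and the computed durations cover programmed hold times only (ramping between temperatures is ignored).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CyclingProgram:
    initial_denature_s: int  # initial hold at 95 C
    cycles: int
    denature_s: int          # per-cycle denaturation at 95 C
    anneal_s: int            # per-cycle annealing
    anneal_temp_c: int
    extend_s: int            # per-cycle extension at 72 C
    final_extend_s: int      # final hold at 72 C


PROGRAMS = {
    "ITS": CyclingProgram(300, 30, 30, 45, 55, 60, 600),
    "RPB2": CyclingProgram(300, 35, 30, 45, 58, 90, 600),
    "beta-tubulin": CyclingProgram(300, 35, 30, 45, 53, 90, 600),
    "EF-1alpha": CyclingProgram(75, 35, 30, 15, 52, 45, 258),  # 4.3 min = 258 s
    "Y13": CyclingProgram(300, 35, 30, 60, 45, 45, 300),
}


def hold_time_s(program):
    """Total programmed hold time in seconds, excluding temperature ramps."""
    per_cycle = program.denature_s + program.anneal_s + program.extend_s
    return program.initial_denature_s + program.cycles * per_cycle + program.final_extend_s


for region, program in PROGRAMS.items():
    print(f"{region}: anneal {program.anneal_temp_c} C, {hold_time_s(program) / 60:.1f} min")
```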

The PCR products were visualized as described above except that PCR products larger than 1 kb were subjected to electrophoresis at 50 V for 60 minutes before being stained with an ethidium bromide solution and visualized under UV light. The PCR products were sequenced in the forward direction only by Laboratory Services, University of Guelph (Guelph, Ontario, Canada). Sequencing was performed on a GeneAmp® PCR System 9700 or 2720 Thermal Cycler (Applied Biosystems).

2.2.5 Sequence alignments and trees

Chromatogram files from sequencing reactions were visualized using BioEdit Sequence Alignment Editor software v.7.0.53 (Hall 1999), and base calls were corrected manually as necessary. Within each genomic region studied, the longest sequence common to all of the sequenced amplicons obtained was used for alignment. Alignments were performed using the default parameters with ClustalX v. 2.0.12 (Chenna et al. 2003). Neighbour-Joining (NJ), Maximum Likelihood (ML), and Maximum Parsimony (MP) trees were constructed using PAUP* v. 4.0 b. 10 (Swofford 2000) using a heuristic search with random addition of sequences and 1,000 bootstrap replicates. For the maximum likelihood calculations, a transition:transversion ratio of 2:1 was used. For the Neighbour-Joining analysis, the HKY85 model was used. For the RPB2, β-tubulin, and ITS regions, DNA sequences from North American isolates of Microdochium bolleyi from our isolate collection were used as outgroups. In the EF-1α analysis, a sequence from Hypocrea lixii (DQ056745.1) was used as an outgroup due to difficulty in amplifying the same portion of the EF-1α gene from M. bolleyi that was analysed for M. nivale and M. majus.
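The resampling step that underlies the 1,000 bootstrap replicates can be sketched as follows. Each replicate is a pseudo-alignment of the original length built by sampling alignment columns with replacement (Felsenstein 1985); a tree would then be estimated from every replicate and the proportion of replicate trees containing a given clade reported as its bootstrap value. The function name, seed, and toy alignment are hypothetical, and tree construction itself (performed in PAUP* in this study) is not shown.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement to build one pseudo-alignment."""
    n_sites = len(alignment[0])
    if any(len(seq) != n_sites for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    # The same column picks are applied to every sequence so that sites stay intact.
    return ["".join(seq[i] for i in picks) for seq in alignment]


rng = random.Random(1)  # fixed seed so that the sketch is repeatable
toy_alignment = ["ACGTAC", "ACGTAT", "ACCTGT"]
replicates = [bootstrap_replicate(toy_alignment, rng) for _ in range(1000)]
print(len(replicates), replicates[0])
```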

2.2.6 Restriction Digests

A restriction digest of the RPB2 PCR product was developed using the restriction enzyme HindIII. Following amplification with the RPB2 primers RPB150F and fRPB2-7cR, the 1,040 bp amplicon was digested by incubating 8 μL of the PCR product for 2 hours at 32 °C in a solution containing 0.02 U HindIII restriction enzyme (Invitrogen, Carlsbad, CA), 1X restriction buffer provided by the enzyme's manufacturer, and enough sterile dH2O to bring the final volume of the solution to 40 μL. Following incubation, the reaction mixture was heated to 72 °C for 10 minutes to inactivate the enzyme. The digested samples were visualized as described for the PCR samples.

2.3 Results

2.3.1 Primer testing and design

At the beginning of these experiments, published primer sequences were tested to determine whether the regions of interest could be consistently amplified from a range of isolates. For both ITS and EF-1α, the published primers produced amplicons of the predicted sizes without optimization. For RPB2, the literature primer set fRPB2-7cF / fRPB2-7cR was tested with 15 isolates of M. nivale and 3 isolates of M. majus. Following optimization of the PCR protocol, two of the three M. majus isolates tested were successfully amplified, but only 3 of the 15 M. nivale isolates tested produced a band of the predicted size (1,500 bp). The remaining 12 M. nivale and the other M. majus isolate failed to yield any amplicon.

The RPB2 amplicon from one of the M. majus isolates was submitted for sequencing in the forward direction (LSD) and a 917 bp sequence was returned. When searched against the GenBank database, the top match was the RPB2 sequence from Pseudomassaria carolinensis (Order Xylariales, Family Nectriaceae; GenBank accession DQ810239.1). A total of 20 RPB2 sequences were collected from GenBank (Table 2.3) and were aligned with the putative RPB2 sequence from M. majus to design a new forward primer. Following optimization, this primer, when paired with fRPB2-7cR, yielded a single band of the predicted size from all of the M. nivale and M. majus isolates tested.

Similarly, the β-tubulin primers Bt1b / Bt2a (Glass and Donaldson 1995) were tested with a total of five M. majus and six M. nivale isolates. Following optimization of the PCR protocol, among the M. majus isolates, two out of five were successfully amplified, while the remaining three isolates yielded multiple bands. Among the M. nivale isolates, only two out of seven were successfully amplified; the remaining isolates failed to amplify under the conditions tested. Two M. majus isolates were sent for sequencing in the forward direction (LSD) and yielded partial gene sequences that were 881 (isolate 99027) and 888 bp (isolate 99049) in length. For both sequences, the top match when searched against the GenBank database (blastn) was the β-tubulin gene from Pestalotiopsis paeoniicola (Order Xylariales, Family ; accession FJ975603.1). These M. majus sequences were then aligned with 19 β-tubulin sequences (Table 2.4) to design new primers. Following optimization, this primer set (Btub526F / Btub1332R) successfully amplified a sequence of the predicted length from all of the M. nivale and M. majus isolates with which it was tested.

2.3.2 Sequence differences between Microdochium nivale and M. majus

Following PCR protocol optimization and primer redesign as necessary, the genomic regions studied were successfully amplified in all of the isolates included in these experiments, with the exception described by Glynn et al. (2005) that M. majus isolates were only amplified by the EFMajF/EFMicR primer pair, and M. nivale isolates were only amplified by the EFNivF/EFMicR primer pair. These were specific primers designed to amplify the different taxa separately (Glynn et al. 2005). The Y13N primer set designed to amplify M. nivale but not M. majus (Nicholson et al. 1996) also failed to amplify some M. nivale isolates (described below). All of the sequences generated in these analyses have been deposited in the NCBI GenBank database (accession numbers JX280526-JX280607).

The Y13N and Y13M primer sets were tested with a set of European and North American strains of M. nivale and M. majus (Table 2.1). Among the M. majus isolates tested, nine out of nine of the European isolates, and six out of six of the North American isolates tested were successfully amplified. Among the M. nivale isolates, 12 of 13 of the European isolates tested and 6 of 22 of the North American isolates tested were amplified (Figure 2.2).

For RPB2, the primer set RPB150F / fRPB2-7cR produced a single amplicon of 1,040 bp in length for all 18 of the samples tested (Table 2.1). These amplicons were sequenced in the forward direction only using the primer RPB150F, yielding raw sequences that varied from 970 to 1,180 bp in length. The alignments were performed using a 729 bp region common among all of the sequences obtained (Appendix 2.3). There were 62 parsimony-informative characters, 581 constant characters, and 86 parsimony-uninformative characters. The six M. majus isolates grouped together into a single clade with bootstrap values of 100% in all three analyses performed. The 12 M. nivale isolates also grouped together into a single clade with bootstrap values of 100% in the MP and NJ trees, and 97% in the ML tree (Figure 2.3).

The HindIII digest of the RPB2 amplicons confirmed the presence of these consistent sequence differences in all of the M. nivale and M. majus isolates tested. While M. nivale amplicons were cut twice to produce fragments that were 125, 444, and 471 bp in length, the M. majus amplicons were cut only once, producing fragments that were 444 and 596 bp in length (Figure 2.4). An alignment showing the cut sites for this restriction enzyme is found in Appendix 2.3.
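The fragment sizes expected from such a digest can be predicted from a sequence by locating each HindIII recognition site (AAGCTT, cut after the first A) and measuring the distances between cuts. The sketch below uses a short hypothetical sequence rather than the RPB2 amplicon, assumes a linear molecule and complete digestion, and ends with an arithmetic check that the reported fragment sizes sum to the 1,040 bp amplicon.

```python
def fragment_lengths(sequence, site="AAGCTT", cut_offset=1):
    """Fragment lengths after complete digestion of a linear DNA molecule.

    cut_offset is the number of bases into the site at which the top strand
    is cut (1 for HindIII, which cuts A^AGCTT).
    """
    seq = sequence.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    boundaries = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(boundaries, boundaries[1:])]


# Hypothetical 23 bp sequence containing two HindIII sites.
toy = "GGGGAAGCTTCCCCCAAGCTTTT"
print(fragment_lengths(toy))

# Consistency check on the fragment sizes reported in the text.
amplicon_bp = 1040
print(sum([125, 444, 471]) == amplicon_bp, sum([444, 596]) == amplicon_bp)
```

The 125 and 471 bp fragments together equal the 596 bp fragment, which is what would be expected if the M. majus amplicon lacks one of the two sites present in M. nivale.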

For the β-tubulin sequences, a single amplicon of 767 bp was obtained for all 18 isolates tested using the primer pair Btub526F and Btub1332R, and these amplicons were sequenced with the forward primer, yielding raw sequences that varied from 723 to 814 bp in length. The alignments were performed using a 720 bp fragment common among all of the sequences obtained (Appendix 2.4). There were 55 parsimony-informative characters, 651 constant characters, and 14 parsimony-uninformative characters. The M. majus isolates tested formed a single clade with bootstrap values of 100, 91, and 99% in the NJ, ML, and MP trees, respectively. The M. nivale isolates tested formed a single clade with 100% bootstrap support in the NJ and MP trees, and 98% in the ML tree (Figure 2.3).

For EF-1α, a single amplicon of either 491 (M. nivale) or 487 (M. majus) bp in length was obtained for all 18 isolates tested, and these amplicons were sequenced in the forward direction using either EFMajF for M. majus or EFNivF for M. nivale isolates. The raw sequences obtained ranged from 450 to 491 bp in length, and a 423 bp region common to all isolates was aligned for analysis (Appendix 2.5). There were 4 parsimony-informative characters, 387 constant characters, and 38 parsimony-uninformative characters. The M. majus isolates tested formed a single group with bootstrap support of 87, 60, and 74% in the NJ, ML, and MP trees. The M. nivale isolates tested formed a single group with bootstrap support of 92, 89, and 93% in NJ, ML, and MP trees (Figure 2.3).

For ITS, a single amplicon 530 bp in length was obtained for the 20 isolates tested, and these amplicons were sequenced in the forward direction using the primer ITS1. The raw sequences obtained ranged from 503 to 530 bp in length, and a 503 bp region common to all isolates was aligned for analysis (Appendix 2.6). There were 21 parsimony-informative characters, and 482 constant characters. All variable characters were parsimony-informative. Only the NJ analysis grouped the M. majus isolates (50% bootstrap support) and the M. nivale isolates (99% bootstrap support) into distinct clades (Figure 2.3).

2.3.3 Geographic and host-specific differences

For RPB2, the European M. majus isolates formed a distinct clade, with NJ, ML, and MP bootstrap values of 99, 100, and 100%, respectively. The North American isolates did not form a single cluster in any of the analyses. Among the M. nivale isolates, the European and North American isolates did not form distinct clades, but all isolates from turf formed a group with bootstrap values of 88, 90, and 96%, as did the wheat isolates with bootstrap values of 85% in the NJ and 73% in both the ML and MP trees.

For β-tubulin, among the M. nivale isolates tested, all European and North American samples isolated from wheat formed a clade with 81, 73, and 77% bootstrap support in the NJ, ML, and MP trees. The M. nivale isolates from turf formed a single group with 53, 54, and 59% bootstrap support in the NJ, ML, and MP trees. No distinction could be made between isolates from North America and those from Europe in any of the analyses with the partial β-tubulin sequences. Neither the ITS nor the partial EF-1α sequences resolved groups based on geographic or host origin for either M. majus or M. nivale, but only a limited number of isolates were tested here. A greater number of representatives from each geographic or host origin might reveal some other relationships.

2.4 Discussion

In this Chapter, the nucleotide sequences of four genomic regions were used to explore the differences between and among the sister species Microdochium nivale and M. majus. Differences between European and North American isolates were investigated by studying an equal number of isolates from both continents. Within M. nivale, which has both a larger host range and greater reported genetic diversity (Maurin et al. 1995), isolates from both turfgrasses and wheat were examined to assess whether differences due to host plant origin could be identified.

All three of the protein-coding regions examined (RPB2, β-tubulin, and EF-1α) resolved all of the M. majus and M. nivale isolates studied into separate clades with strong bootstrap support. These specific groupings were not resolved with ITS. The ITS region also failed to resolve any of the other clades supported by the other genomic regions studied. Despite the apparent inability of ITS to distinguish between M. nivale and M. majus, digestion of the ITS amplified region with RsaI has been shown to produce a single cut for M. majus isolates, while M. nivale is uncut (Maurin et al. 1995). Examination of the sequences obtained in this study found that the presence or absence of this restriction site as reported by other researchers was conserved in all of the European and North American isolates sequenced (Appendix 2.6).

The inability of ITS sequences to resolve M. nivale and M. majus into separate clades using various tree-building methods supports the assertion that ITS may not be useful for all species-level fungal phylogenies. Concerted evolution of the multiple copies of this sequence within each individual’s genome may slow the rate of change of this sequence and maintain conservation between closely-related taxa, although the presumed lack of selective pressure allows this region to accumulate mutations between more distantly-related groups (Elder and Turner 1995). Furthermore, due to the high copy number of ITS within a genome, direct sequencing of a PCR product without cloning may yield polymorphic results, with clear variability within the chromatogram reflecting site ambiguity. The sequence thus obtained is essentially a consensus sequence that probably reflects the most common of the variable characters, but which may also obscure real differences between sister species that have been separated for only a short evolutionary time.

The EF-1α region, used as direct evidence to support the elevation of M. nivale and M. majus to sister species by Glynn et al. (2005), resolved these taxa into distinct clades in this study. In the paper by Glynn and colleagues, two separate groups of primers were designed: the EF1/F and EF1/R set, which produced an amplicon of approximately 840 bp from both M. nivale and M. majus, and the EFNivF, EFMajF, and EFMicR primers, which together amplified fragments of approximately 430 bp from either M. nivale or M. majus exclusively. This latter set of primers was used to produce a 430 bp fragment of EF-1α in this study. The amplicon contained only four parsimony-informative positions and did not resolve further sub-divisions, such as geographical regions. However, Glynn et al. (2005) used an 832 bp amplicon (produced by the EF1/F and EF1/R primers), and 56 of the 77 phylogenetically informative positions they identified lay outside of the region sequenced in this study. It is thus possible that the trends observed for RPB2 and β-tubulin may also be supported by larger portions of, or the full, EF-1α sequence. The EF-1α primers used in this study did not amplify DNA from M. bolleyi despite several attempts to optimize the reaction. This result suggests that nucleotide-level differences exist within the EF-1α regions of Microdochium species. This hypothesis is further explored in Chapter 3.

Among the M. nivale isolates, both RPB2 and β-tubulin supported the resolution of distinct clades for all of the isolates originally collected on wheat, regardless of their geographic origin. These genes also supported a distinct clade for isolates originally collected from bentgrasses (Agrostis spp.) in five out of the six analyses performed. None of the analyses supported a distinct clade for either the North American or the European M. nivale isolates. All M. majus isolates from North America were collected in one geographical area near Atwood, Ontario, but the β-tubulin tree demonstrates that these isolates were not identical in sequence. Despite the fact that the North American M. majus samples were collected in close proximity, none of the sequences were identical for any of the four genetic regions examined. Similarly, multiple isolates of M. majus were chosen for analysis (rather than the same three strains for all analyses) to examine the diversity of the M. majus isolates in our collection. The presence of genetic variation within M. majus, in addition to reports of perithecia (Parry et al. 1995), suggests that sexual recombination occurs for this species in the field.

Together, these results suggest that host specialization has occurred within M. nivale, which exhibits a wider host range than M. majus. This finding is concordant with the observation that M. nivale isolates collected across Europe from a variety of plant hosts were genetically heterogeneous (Maurin et al. 1995). Similarly, the RFLP profiles of isolates of M. nivale sensu stricto from a single turfgrass host were more similar to those obtained for isolates from a different geographic location than they were to those of isolates from a different turfgrass species (Mahuku et al. 1998). High genetic diversity was also observed within the populations studied (Mahuku et al. 1998), suggesting that the differences observed between populations cannot be explained only by ecological separation. Sexual reproduction, which has not been explicitly observed in the field for M. nivale (Smith 1992), but which has been observed in vitro (Lees et al. 1995) and for which there is indirect evidence in the way of linkage disequilibrium calculations (Mahuku et al. 1998), may explain these observations.

For M. majus, the RPB2 analyses alone supported a group containing all of the European isolates studied, but the North American isolates were not grouped together. However, some differences were observed between the North American and European M. nivale isolates using the Y13NF/Y13NR primer set. Both the Y13N and the Y13M primer sets are reported to discriminate between M. nivale and M. majus by selectively amplifying only M. nivale (Y13N) or M. majus (Y13M). Both sets of Y13 primers are RAPD primers amplifying genomic regions of unknown function. The original publication describing these primers found that the Y13N pair failed to amplify the single North American M. nivale isolate that was tested (Nicholson et al. 1996). All of the M. majus isolates tested in this study, regardless of their geographic origin, were successfully amplified by the Y13M primer set. However, 16 of the 22 North American M. nivale isolates tested failed to amplify with the Y13N primer set, whereas 12 of the 13 European isolates tested under the same conditions were amplified successfully. These results demonstrate not only the high level of sequence variability within M. nivale, but also the necessity of testing screening primers of this type with the widest variety of isolates possible to ensure that consistent results can be obtained. This problem is further explored in Chapter 3.

Combined with the results from other analyses, these data suggest that there is evidence for geographic specialization within both M. nivale and M. majus. For M. nivale, host species specialization may have a larger effect on genetic diversity than geographic location. To address this hypothesis, whole-genome sequencing of M. majus isolates from a wider range of plant hosts would be informative. Relative to M. nivale, M. majus has previously displayed a small amount of genetic heterogeneity regardless of host plant origin (Maurin et al. 1995), so the apparent lack of geographic specialization is in concordance with these observations.

Among the four genetic regions studied, β-tubulin and RPB2 were found to be more phylogenetically informative than either ITS or EF-1α, all four of which are commonly used in multi-gene phylogenies. β-tubulin has previously been used to study closely related sister species (Myllys et al. 2001), and resolved M. nivale and M. majus into distinct clades in this study. In a recent multi-gene analysis that discovered cryptic species within the Neofusicoccum parvum/N. ribis species complex, RPB2 was also found to contain the largest number of phylogenetically informative characters (Pavlic et al. 2009). This is consistent with the assertion of Schoch et al. (2009) that, among a set of six genes (three protein-coding and three rDNA sequences), RPB2 was the most phylogenetically useful gene studied.

Overall, the data described in this Chapter support the reclassification of M. nivale and M. majus as distinct species. The hypothesis that there may be distinct sub-groups within M. nivale and M. majus was also supported, but additional isolates of each species should be examined before the presence or absence of distinct varieties within these species can be determined. To further explore the level of variation within and between M. nivale and M. majus, the whole-genome sequences of three isolates (two M. nivale and one M. majus) were obtained (Chapter 3).


2.5 References for Chapter 2

Bickford, D., Lohman, D.J., Sodhi, N.S., Ng, P.K.L., Meier, R., Winker, K., Ingram, K.K., and Das, I. 2007. Cryptic species as a window on diversity and conservation. Trends in Ecology & Evolution 22: 148-155.
Chenna, R., Sugawara, H., Koike, T., Lopez, R., Gibson, T.J., Higgins, D.G., and Thompson, J.D. 2003. Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Research 31(13): 3497-3500.
Damon, C., Barroso, G., Ferandon, C., Ranger, J., Fraissinet-Tachet, L., and Marmeisse, R. 2010. Performance of the COX1 gene as a marker for the study of metabolically active Pezizomycotina and Agaricomycetes fungal communities from the analysis of soil RNA. FEMS Microbiology Ecology 74(3): 693-705.
de Candolle, A.P. 1867. Lois de la nomenclature botanique adoptées par le Congrès International de Botanique tenu à Paris en août 1867 suivies d'une deuxième édition de l'introduction historique et du commentaire qui accompagnaient la rédaction préparatoire présentée à la congrès. J.-B. Baillière et fils, Paris.
Edwards, K., Johnstone, C., and Thompson, C. 1991. A simple and rapid method for the preparation of plant genomic DNA for PCR analysis. Nucleic Acids Research 19: 1349.
Elder, J.F., and Turner, B.J. 1995. Concerted evolution of repetitive DNA sequences in eukaryotes. The Quarterly Review of Biology 70: 297-320.
Felsenstein, J. 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39: 783-791.
Fitch, W.M. 1977. On the problem of discovering the most parsimonious tree. The American Naturalist 111: 223-257.
Gerlach, W., and Nirenberg, H. 1982. The genus Fusarium, a Pictorial Atlas. Mitteilungen aus der Biologischen Bundesanstalt für Land- und Forstwirtschaft, Berlin-Dahlem 209: 406 pp.
Glass, N.L., and Donaldson, G.C. 1995. Development of primer sets designed for use with the PCR to amplify conserved genes from filamentous ascomycetes. Applied and Environmental Microbiology 61(4): 1323-1330.
Glynn, N.C., Hare, M.C., Parry, D.W., and Edwards, S.G. 2005. Phylogenetic analysis of EF-1 alpha gene sequences from isolates of Microdochium nivale leads to elevation of varieties majus and nivale to species status. Mycological Research 109: 872-880.
Gontcharov, A.A., Marin, B., and Melkonian, M. 2004. Are combined analyses better than single gene phylogenies? A case study using SSU rDNA and rbcL sequence comparisons in the Zygnematophyceae (Streptophyta). Molecular Biology and Evolution 21: 612-624.
Hall, T.A. 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symposium Series 41: 95-98.
Harrison, C.J., and Langdale, J.A. 2006. A step by step guide to phylogeny reconstruction. The Plant Journal 45: 561-572.
Hawksworth, D.L. 2004. Limitation of dual nomenclature for pleomorphic fungi. Taxon 53(2): 596-598.
Hibbett, D.S., Binder, M., Bischoff, J.F., Blackwell, M., Cannon, P.F., Eriksson, O.E., Huhndorf, S., James, T., Kirk, P.M., Lücking, R., Thorsten Lumbsch, H., Lutzoni, F., Matheny, P.B., McLaughlin, D.J., Powell, M.J., Redhead, S., Schoch, C.L., Spatafora, J.W., Stalpers, J.A., Vilgalys, R., Aime, M.C., Aptroot, A., Bauer, R., Begerow, D., Benny, G.L., Castlebury, L.A., Crous, P.W., Dai, Y.-C., Gams, W., Geiser, D.M., Griffith, G.W., Gueidan, C., Hawksworth, D.L., Hestmark, G., Hosaka, K., Humber, R.A., Hyde, K.D., Ironside, J.E., Kõljalg, U., Kurtzman, C.P., Larsson, K.-H., Lichtwardt, R., Longcore, J., Miądlikowska, J., Miller, A., Moncalvo, J.-M., Mozley-Standridge, S., Oberwinkler, F., Parmasto, E., Reeb, V., Rogers, J.D., Roux, C., Ryvarden, L., Sampaio, J.P., Schüßler, A., Sugiyama, J., Thorn, R.G., Tibell, L., Untereiner, W.A., Walker, C., Wang, Z., Weir, A., Weiss, M., White, M.M., Winka, K., Yao, Y.-J., and Zhang, N. 2007. A higher-level phylogenetic classification of the Fungi. Mycological Research 111(5): 509-547.
Holder, M., and Lewis, P.O. 2003. Phylogeny estimation: traditional and Bayesian approaches. Nature Reviews Genetics 4: 275-284.
Jewell, L.E., and Hsiang, T. 2013. Multigene differences between Microdochium nivale and Microdochium majus. Botany 91(2): 99-102.
Kaneko, I., and Ishii, H. 2009. Effect of azoxystrobin on activities of antioxidant enzymes and alternative oxidase in wheat head blight pathogens Fusarium graminearum and Microdochium nivale. Journal of General Plant Pathology 75(5): 388-398.
Keeling, P.J., Luker, M.A., and Palmer, J.D. 2000. Evidence from beta-tubulin phylogeny that Microsporidia evolved from within the fungi. Molecular Biology and Evolution 17(1): 23-31.
Kibbe, W.A. 2007. OligoCalc: an online oligonucleotide properties calculator. Nucleic Acids Research 35: W43-W46.
Koufopanou, V., Burt, A., Szaro, T., and Taylor, J.W. 2001. Gene genealogies, cryptic species, and molecular evolution in the human pathogen Coccidioides immitis and relatives (Ascomycota, Onygenales). Molecular Biology and Evolution 18: 1246-1258.
Lees, A.K., Nicholson, P., Rezanoor, H.N., and Parry, D.W. 1995. Analysis of variation within Microdochium nivale from wheat - evidence for a distinct subgroup. Mycological Research 99: 103-109.
Li, W.-H. 1997. Molecular Evolution. Sinauer Associates, Inc., Sunderland, MA.
Litschko, L., and Burpee, L.L. 1987. Variation among isolates of Microdochium nivale collected from wheat and turfgrasses. Transactions of the British Mycological Society 89: 252-256.
Liu, Y.L., Whelen, S., and Hall, B.D. 1999. Phylogenetic relationships among ascomycetes: evidence from an RNA polymerase II subunit. Molecular Biology and Evolution 16: 1799-1808.
Lizon, P., and Samuels, G. 1997. Taxonomy and nomenclature defined. In Inoculum. Mycological Society of America.
Mahuku, G.S., Hsiang, T., and Yang, L. 1998. Genetic diversity of Microdochium nivale isolates from turfgrass. Mycological Research 102: 559-567.
Maurin, N., Rezanoor, H.N., Lamkadmi, Z., Some, A., and Nicholson, P. 1995. A comparison of biological, molecular, and enzymatic markers to investigate variability within Microdochium nivale (Fries) Samuels and Hallett. Agronomie 15(1): 39-47.
McNeill, J., Barrie, F.R., Buck, W.R., Demoulin, V., Greuter, W., Hawksworth, D.L., Herendeen, P.S., Knapp, S., Marhold, K., Prado, J., Prud'homme van Reine, W.F., Smith, G.F., and Wiersema, J.H. 2012. International Code of Nomenclature for algae, fungi, and plants (Melbourne Code), Adopted by the Eighteenth International Botanical Congress Melbourne, Australia, July 2011. International Association for Plant Taxonomy, Bratislava.
Myllys, L., Lohtander, K., and Tehler, A. 2001. Beta-tubulin, ITS, and group I intron sequences challenge the species pair concept in Physcia aipolia and P. caesia. Mycologia 93(2): 335-343.
Myllys, L., Stenroos, S., and Thell, A. 2002. New genes for phylogenetic studies of lichenized fungi: glyceraldehyde-3-phosphate dehydrogenase and beta-tubulin genes. Lichenologist 34(4): 237-246.
Nicholson, P., Lees, A.K., Maurin, N., Parry, D.W., and Rezanoor, H.N. 1996. Development of a PCR assay to identify and quantify Microdochium nivale var nivale and Microdochium nivale var majus in wheat. Physiological and Molecular Plant Pathology 48(4): 257-271.
Nilsson, R.H., Kristiansson, E., Ryberg, M., Hallenberg, N., and Larsson, K.-H. 2008. Intraspecific ITS variability in the Kingdom Fungi as expressed in the international sequence databases and its implications for molecular species identification. Evolutionary Bioinformatics 4: 193-201.
O'Brien, H.E., Parrent, J.L., Jackson, J.A., Moncalvo, J.-M., and Vilgalys, R. 2005. Fungal community analysis by large-scale sequencing of environmental samples. Applied and Environmental Microbiology 71(9): 5544-5550.
Parry, D.W., Rezanoor, H.N., Pettitt, T.R., Hare, M.C., and Nicholson, P. 1995. Analysis of Microdochium nivale isolates from wheat in the UK during 1993. Annals of Applied Biology 126(3): 449-455.
Pavlic, D., Slippers, B., Coutinho, T.A., and Wingfield, M.J. 2009. Molecular and phenotypic characterization of three phylogenetic species discovered within the Neofusicoccum parvum / N. ribis complex. Mycologia 101(5): 636-647.
Pociecha, E., Plazek, A., Rapacz, M., Niemczyk, E., and Zwierzykowski, Z. 2010. Photosynthetic activity and soluble carbohydrate content induced by the cold acclimation affect frost tolerance and resistance to Microdochium nivale of androgenic Festulolium genotypes. Journal of Agronomy and Crop Science 196(1): 48-54.
Pringle, A., Baker, D.M., Platt, J.L., Wares, J.P., Latge, J.P., and Taylor, J.W. 2005. Cryptic speciation in the cosmopolitan and clonal human pathogenic fungus Aspergillus fumigatus. Evolution 59: 1886-1899.
Rozen, S., and Skaletsky, H. 2000. Primer3 on the WWW for general users and for biologist programmers. In Methods in Molecular Biology. Edited by S. Misener, and S.A. Krawetz. Humana Press, Totowa, NJ.
Saitou, N., and Nei, M. 1987. The neighbour-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution 4: 406-425.
Schoch, C.L., Seifert, K.A., Huhndorf, S., Robert, V., Spouge, J.L., Levesque, C.A., Chen, W., and Fungal Barcoding Consortium. 2012. Nuclear ribosomal internal transcribed spacer (ITS) region as a universal DNA barcode marker for Fungi. Proceedings of the National Academy of Sciences of the U.S.A.
Schoch, C.L., Sung, G.H., Lopez-Giraldez, F., Townsend, J.P., Miadlikowska, J., Hofstetter, V., Robbertse, B., Matheny, P.B., Kauff, F., Wang, Z., Gueidan, C., Andrie, R.M., Trippe, K., Ciufetti, L.M., Wynns, A., Fraker, E., Hodkinson, B.P., Bonito, G., Groenewald, J.Z., Arzanlou, M., de Hoog, G.S., Crous, P.W., Hewitt, D., Pfister, D.H., Peterson, K., Gryzenhout, M., Wingfield, M.J., Aptroot, A., Suh, S.O., Blackwell, M., Hillis, D.M., Griffith, G.W., Castlebury, L.A., Rossman, A.Y., Lumbsch, H.T., Lucking, R., Budel, B., Rauhut, A., Diederich, P., Ertz, D., Geiser, D.M., Hosaka, K., Inderbitzin, P., Kohlmeyer, J., Volkmann-Kohlmeyer, B., Mostert, L., O'Donnell, K., Sipman, H., Rogers, J.D., Shoemaker, R.A., Sugiyama, J., Summerbell, R.C., Untereiner, W., Johnston, P.R., Stenroos, S., Zuccaro, A., Dyer, P.S., Crittenden, P.D., Cole, M.S., Hansen, K., Trappe, J.M., Yahr, R., Lutzoni, F., and Spatafora, J.W. 2009. The Ascomycota Tree of Life: A Phylum-wide Phylogeny Clarifies the Origin and Evolution of Fundamental Reproductive and Ecological Traits. Systematic Biology 58(2): 224-239.
Schuh, R.T., and Brower, A.V.Z. 2009. Biological Systematics: principles and applications (second edition). Cornell University, Ithaca, NY.
Seifert, K.A., Samson, R.A., deWaard, J.R., Houbraken, J., Lévesque, C.A., Moncalvo, J.-M., Louis-Seize, G., and Hebert, P.D.N. 2007. Prospects for fungus identification using CO1 DNA barcodes, with Penicillium as a test case. Proceedings of the National Academy of Sciences of the U.S.A. 104(10): 3901-3906.
Smith, J.D. 1992. Snow mould fungi in Canada. Norwegian Journal of Agricultural Sciences Supplement(7): 5-12.
Swofford, D.L. 2000. PAUP*: phylogenetic analysis using parsimony (*and other methods). Version 4.0 b10. Sinauer Associates, Sunderland, MA.
Taylor, J.W. 2011. One fungus = one name: DNA and fungal nomenclature twenty years after PCR. IMA Fungus 2(2): 113-120.
Wang, J., Levy, M., and Dunkle, L.D. 1998. Sibling species of Cercospora associated with gray leaf spot of maize. Phytopathology 88(12): 1269-1275.
Wang, Z., Nilsson, R.H., Lopez-Giraldez, F., Zhuang, W.-y., Dai, Y.-C., Johnston, P.R., and Townsend, J.P. 2011. Tasting soil fungal diversity with earth tongues: phylogenetic test of SATé alignments for environmental ITS data. PLoS ONE 6: e19039.
Webster, J., and Weber, R.W.S. 2007. Introduction to Fungi, 3rd edition. Cambridge University Press, New York.
White, T.J., Bruns, T., Lee, S., and Taylor, J.W. 1990. Amplification and direct sequencing of fungal ribosomal RNA genes for phylogenetics. In PCR Protocols: A Guide to Methods and Applications. Edited by M. Innis, D. Gelfand, J. Sninsky, and T. White. Academic Press, New York. pp. 315-322.
Woese, C.R., and Olsen, G.J. 1986. Archaebacterial phylogeny: perspectives on the urkingdoms. Systematic and Applied Microbiology 7: 161-177.


Table 2.1 Isolates of Microdochium nivale and M. majus, including their geographic origin and host-plant origin used for nucleotide sequence analysis.

                                                                  Genomic regions studied*
Isolate  Species    Geographic origin         Host plant          RPB2  Btub  EF1α  ITS   Y13
99027    M. majus   Atwood, Canada            Triticum sp.        -     +     +     +     -
99049    M. majus   Atwood, Canada            Triticum sp.        +     +     +     +     +
99061    M. majus   Atwood, Canada            Triticum sp.        +     +     -     +     +
99064    M. majus   Atwood, Canada            Triticum sp.        +     -     +     +     -
12043    M. majus   Ottawa, Canada            Triticum sp.        -     -     -     -     +
12044    M. majus   Ottawa, Canada            Triticum sp.        -     -     -     -     +
12045    M. majus   Ottawa, Canada            Triticum sp.        -     -     -     -     +
12046    M. majus   Ottawa, Canada            Triticum sp.        -     -     -     -     +
10095    M. majus   Argilly, France           Triticum sp.        -     -     -     -     +
10096    M. majus   Argilly, France           Triticum sp.        -     -     +     +     +
10097    M. majus   Argilly, France           Triticum sp.        -     -     -     -     +
10098    M. majus   Argilly, France           Triticum sp.        +     +     -     -     +
10099    M. majus   Medicina, Italy           Triticum sp.        +     +     +     +     +
10100    M. majus   Medicina, Italy           Triticum sp.        -     -     -     -     +
10148    M. majus   Castelnaudary, France     Triticum sp.        -     -     -     -     +
10149    M. majus   Bullion, France           Triticum sp.        +     +     +     +     +
10150    M. majus   Aryon, France             Triticum sp.        -     -     -     -     +
96101    M. nivale  Cambridge, Canada         Agrostis palustris  +     +     +     +     -
96103    M. nivale  Cambridge, Canada         Agrostis palustris  +     +     +     +     -
96107    M. nivale  Cambridge, Canada         Agrostis palustris  +     +     +     +     -
10085    M. nivale  Guelph, Canada            Agrostis palustris  -     -     -     +     -
11036    M. nivale  Guelph, Canada            Poa pratensis       -     -     -     +     -
12049    M. nivale  Guelph, Canada            Poa pratensis       -     -     -     -     +
12150    M. nivale  Guelph, Canada            Poa pratensis       -     -     -     -     +
12051    M. nivale  Guelph, Canada            Poa pratensis       -     -     -     -     +
12134    M. nivale  Guelph, Canada            Lolium perenne      -     -     -     -     +
12135    M. nivale  Guelph, Canada            Lolium perenne      -     -     -     -     +
12136    M. nivale  Guelph, Canada            Lolium perenne      -     -     -     -     +
12137    M. nivale  Guelph, Canada            Lolium perenne      -     -     -     -     +
12138    M. nivale  Guelph, Canada            Lolium perenne      -     -     -     -     +
12139    M. nivale  Guelph, Canada            Lolium perenne      -     -     -     -     +
12141    M. nivale  Guelph, Canada            Lolium perenne      -     -     -     -     +
12142    M. nivale  Atwood, Canada            Triticum sp.        -     -     -     -     +
12143    M. nivale  Atwood, Canada            Triticum sp.        -     -     -     -     +
12144    M. nivale  Atwood, Canada            Triticum sp.        -     -     -     -     +
12145    M. nivale  Ottawa, Canada            Triticum sp.        -     -     -     -     +
12151    M. nivale  Ottawa, Canada            Triticum sp.        -     -     -     -     +
12152    M. nivale  Ottawa, Canada            Triticum sp.        -     -     -     -     +
12153    M. nivale  Ottawa, Canada            Triticum sp.        -     -     -     -     +
12154    M. nivale  Ottawa, Canada            Triticum sp.        -     -     -     -     +
12155    M. nivale  Ottawa, Canada            Triticum sp.        -     -     -     -     +
10082    M. nivale  UK                        Agrostis sp.        +     +     +     +     +
10083    M. nivale  UK                        Agrostis sp.        +     +     +     +     +
10101    M. nivale  St. Leon-Rot, Germany     Agrostis sp.        +     +     +     +     -
10102    M. nivale  Switzerland               Agrostis sp.        -     -     -     -     +
10103    M. nivale  Ottobeuren, Germany       Agrostis sp.        -     -     -     -     +
10104    M. nivale  Netherlands               Triticum sp.        -     -     -     -     +
10105    M. nivale  Germany                   Agrostis sp.        -     -     -     -     +
99006    M. nivale  Atwood, Canada            Triticum sp.        +     +     +     -     -
99010    M. nivale  Atwood, Canada            Triticum sp.        -     -     -     -     +
99063    M. nivale  Atwood, Canada            Triticum sp.        -     -     -     -     +
99069    M. nivale  Atwood, Canada            Triticum sp.        +     +     +     +     -
99077    M. nivale  Atwood, Canada            Triticum sp.        -     -     -     -     +
99084    M. nivale  Atwood, Canada            Triticum sp.        +     +     +     +     -
10106    M. nivale  Medicina, Italy           Triticum sp.        +     +     +     +     -
10107    M. nivale  Medicina, Italy           Triticum sp.        +     +     +     +     -
10151    M. majus   Aryon, France             Triticum sp.        -     -     -     +     -
10152    M. nivale  Castelnaudary, France     Triticum sp.        +     +     +     +     +
10153    M. nivale  Bullion, France           Triticum sp.        -     -     -     -     +
10154    M. nivale  Aryon, France             Triticum sp.        -     -     -     -     +
10155    M. nivale  Verdun sur Doubs, France  Triticum sp.        -     -     -     -     +
10156    M. nivale  Corgoloin, France         Triticum sp.        -     -     -     -     +

* RPB2 = RNA polymerase II; Btub = β-tubulin; EF1α = elongation factor-1α; ITS = rDNA internal transcribed spacer; Y13 = region of unknown function identified by RAPD analyses (Nicholson et al. 1996). A "+" indicates that the sequence of a given genetic region was analysed for a particular isolate, and "-" indicates that this genetic region was not analysed for this isolate.


Table 2.2 Primers used in PCR and sequencing reactions.

Genetic region  Primer name  Primer sequence (5'-3')    Amplicon size (bp)  Reference
ITS             ITS1         TCCGTAGGTGAACCTGCGG        530                 White et al. 1990
ITS             ITS4         TCCTCCGCTTATTGATATGC                           White et al. 1990
EF-1α           EFMajF       CCCCTTCTCCCTATCGC          487* or 491†        Glynn et al. 2005
EF-1α           EFNivF       GTTCCCCTGTCTGACTGTTGT                          Glynn et al. 2005
EF-1α           EFMicR       TCGATGGAGTCGATGG                               Glynn et al. 2005
RPB2            fRPB2-5F     GAYGAYMGWGATCAYTTYGG       1,500               Liu et al. 1999
RPB2            fRPB2-7cR    CCCATRGCTTGYTTRCCCAT                           Liu et al. 1999
RPB2            RPB150_F     CTGGGGWGATCARAAGAAGG       1,040‡              This study
β-tubulin       Bt2a         GGTAACCAAATCGGTGCTGCTTTC   ca. 880             Glass and Donaldson 1995
β-tubulin       Bt1b         GACGAGATCGTTCATGTTGAACTC                       Glass and Donaldson 1995
β-tubulin       Btub526_F    CGAGCGYATGAGYGTYTACTT      767                 This study
β-tubulin       Btub1332_R   TCATGTTCTTGGGGTCGAA                            This study
Unknown         Y13MF        CTTGAGGCGGAAGATCGC         220§                Nicholson et al. 1996
Unknown         Y13MR        ATCCCTTTTCCGGGGTTG                             Nicholson et al. 1996
Unknown         Y13NF        CCAGCCGATTTGTGGTTATG       300||               Nicholson et al. 1996
Unknown         Y13NR        GGTCACGAGGCAGAGTTCG                            Nicholson et al. 1996

* for EFMajF and EFMicR when paired with M. majus isolates only
† for EFNivF and EFMicR when paired with M. nivale isolates only
‡ when paired with fRPB2-7cR
§ when paired with M. majus only
|| when paired with M. nivale only
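Several primers in Table 2.2 contain IUPAC degenerate bases (for example Y, R, M, and W), so each one stands for a small pool of concrete oligonucleotides. The sketch below is an illustration, not part of the methods used in this chapter: it counts the degeneracy of each such primer and checks whether RPB150_F matches an ungapped stretch of the Hypocrea lixii RPB2 sequence transcribed from Appendix 2.1. The helper names (`degeneracy`, `expand`, `primer_regex`) are hypothetical.

```python
import re
from itertools import product

# IUPAC nucleotide codes mapped to the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    total = 1
    for base in primer:
        total *= len(IUPAC[base])
    return total


def expand(primer: str) -> list[str]:
    """List every concrete sequence represented by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


def primer_regex(primer: str) -> re.Pattern:
    """Compile a degenerate primer into a regular expression."""
    return re.compile("".join(f"[{IUPAC[b]}]" for b in primer))


# Degenerate primer sequences copied from Table 2.2.
PRIMERS = {
    "fRPB2-5F": "GAYGAYMGWGATCAYTTYGG",
    "fRPB2-7cR": "CCCATRGCTTGYTTRCCCAT",
    "RPB150_F": "CTGGGGWGATCARAAGAAGG",
    "Btub526_F": "CGAGCGYATGAGYGTYTACTT",
}

for name, sequence in PRIMERS.items():
    print(f"{name}: {degeneracy(sequence)}-fold degenerate")

# Ungapped Hypocrea lixii RPB2 fragment around the RPB150_F site,
# transcribed from the alignment in Appendix 2.1 (gap characters removed).
template = "GGCAACTGGGGTGATCAGAAGAAGGCCATGAGC"
match = primer_regex(PRIMERS["RPB150_F"]).search(template)
print("RPB150_F site found at offset:", match.start() if match else None)
```

An exact regular-expression match is a strict criterion; in practice a primer may still anneal with one or two mismatches, which this sketch does not model.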


Table 2.3 List of RPB2 sequences used to design primers for M. nivale and M. majus (Sordariomycetes, Xylariales) with taxonomic information and GenBank accession numbers

Class            Order              Family              Genus and species             GenBank accession
Eurotiomycetes   Eurotiales         Trichocomaceae      Penicillium chrysogenum       XM_002568249.1
Lecanoromycetes  Lecanorales        Ophioparmaceae      Ophioparma lapponica          DQ973089.1
Leotiomycetes    Helotiales         Sclerotiniaceae     Botrytis ricini               GQ860997.1
Saccharomycetes  Saccharomycetales  Saccharomycetaceae  Candida tropicalis            AY485615.1
Saccharomycetes  Saccharomycetales  Saccharomycetaceae  Debaryomyces hansenii         XM_002770548.1
Sordariomycetes  Hypocreales        Bionectriaceae      Peethambara spirostriata      EF692516.1
Sordariomycetes  Hypocreales        Clavicipitaceae     Metarhizium anisopliae        FJ787323.1
Sordariomycetes  Hypocreales        Hypocreaceae        Arachnocrea stipata           EU710770.1
Sordariomycetes  Hypocreales        Hypocreaceae        Hypocrea lixii                FJ179608.1
Sordariomycetes  Hypocreales        Hypocreaceae        H. voglmayrii                 FJ179622.2
Sordariomycetes  Hypocreales        Hypocreaceae        Protocrea farinosa            EU703942.1
Sordariomycetes  Hypocreales        Hypocreaceae        P. pallida                    EU703944.2
Sordariomycetes  Hypocreales        Hypocreaceae        Sphaerostilbella aureonitens  FJ442763.1
Sordariomycetes  Hypocreales        Hypocreaceae        Sporophagomyces chrysostomus  EU710780.1
Sordariomycetes  Hypocreales        Hypocreales         Cladobotryum cubitense        EU710771.1
Sordariomycetes  Hypocreales        Hypocreales         Fusarium virguliforme         GU170599.1
Sordariomycetes  Hypocreales        Hypocreales         Stachybotrys echinata         EF692518.1
Sordariomycetes  Hypocreales        Hypocreales         Trichoderma ovalisporum       FJ442796.1
Sordariomycetes  Xylariales         Xylariaceae         Daldinia concentrica          DQ368651.1
Sordariomycetes  Xylariales         Xylariaceae         Seynesia erumpens             AY641073.1
Sordariomycetes  Xylariales         Xylariaceae         Xylaria hypoxylon             DQ368652.1


Table 2.4 List of species used to design β-tubulin primers for M. nivale and M. majus with GenBank accession numbers. Other than the Microdochium species, all species included are members of the Xylariaceae.

Genus and species            GenBank accession number
Podosordaria mexicana        GQ502719.1
Xylaria escharoidea          GQ502709.1
Amphirosellinia fushanensis  GQ495950.1
Astrocystis bambusae         GQ495942.1
Stilbohypoxylon elaeicola    GQ495933.1
Discoxylaria myrmecophila    GQ487710.1
Penzigia cantareirensis      GQ478220.1
Kretzschmaria guyanensis     GQ478214.1
Entoleuca mammata            GQ470230.1
Rosellinia merrillii         GQ470229.1
Nemania macrocarpa           GQ470226.1
Hypoxylon investiens         FJ185299.1
Daldinia concentrica         FJ185285.1
Annulohypoxylon cohaerens    FJ185283.1
Whalleya microplaca          EF025614.1
Theissenia cinerea           EF025613.1
Kretzschmaria lucidula       EF025610.1
Nemania illita               EF025608.1
Creosphaeria sassafras       DQ840094.1
Microdochium majus           -*
Microdochium majus           -†

* In-house M. majus isolate 99027
† In-house M. majus isolate 99049


[Figure 2.1: line chart. Series: maximum parsimony, maximum likelihood, neighbour joining. x-axis: Year (2000 to 2012); y-axis: Number of publications (0 to 9000).]

Figure 2.1 Number of research journal papers published between 2000 and 2012 that included the terms "maximum likelihood", "maximum parsimony", or either "neighbor joining" or "neighbour joining". Counts were obtained by searching the Google Scholar database for the term phylog* combined with "maximum likelihood", "maximum parsimony", or "'neighbour joining' OR 'neighbor joining'".


Figure 2.2 Gel image for European (lanes A-F) and North American (lanes G-I, K-O) isolates of Microdochium nivale amplified with the Y13NF and Y13NR primers, which target a genetic region of unknown function. All six of the European isolates (lanes A-F) were strongly amplified, whereas weak bands are present in only two (lanes H, I) of the North American samples. A negative control, containing water instead of DNA, was included in the reaction (lane Q). Selected sizes (bp) from a 100 bp increment ladder (lane J) are indicated to the right.


Figure 2.3 Bootstrapped maximum likelihood trees for RPB2, β-tubulin, EF-1α, and ITS. The tips of the tree are labelled with isolate number, species (M or N for M. majus or M. nivale, respectively), host origin (W or T for wheat or turfgrass), and geographic origin (NA or EU for North America or Europe). Bootstrap values out of 100 are located on the respective branches. Nodes with less than 50% support were collapsed. Separate clades resolving the sister species and host-specific or geographic groupings are indicated by curly braces.


[Figure 2.4: gel image. Lanes labelled A to I; ladder sizes marked at 1000, 600, and 400 bp.]

Figure 2.4 Gel image for HindIII digest of RPB2 amplicon of M. nivale (lanes A-C) and M. majus (lanes D-F). Partial RPB2 sequences were amplified with the primers RPB150_F and fRPB2-7cR. The amplicons of M. nivale isolates were digested at two locations to produce fragments that were 125, 444, and 475 bp in length, and the M. majus amplicons were digested at only one location, producing fragments that were 444 and 596 bp in length. Note that the 444 and 475 bp bands were not well resolved. A non-digested RPB2 amplicon (1,040 bp in length) from M. majus is included for comparison (lane G). Selected sizes (bp) from a 100 bp increment ladder (lane J) are indicated to the right.
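The fragment sizes in a digest such as the one in Figure 2.4 follow directly from the positions of the recognition sites in the amplicon. The sketch below is a minimal in-silico digest, assuming the standard HindIII recognition sequence AAGCTT with the cut after the first A. The function name `digest` is hypothetical, and the template is a mock placeholder sequence built to carry a single site, not a real RPB2 amplicon.

```python
def digest(sequence: str, site: str = "AAGCTT", cut_offset: int = 1) -> list[int]:
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the number of bases of the recognition site that stay
    on the upstream fragment (1 for HindIII, which cuts A^AGCTT).
    """
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    return [right - left for left, right in zip(boundaries, boundaries[1:])]


# Mock 1,040 bp amplicon (placeholder bases only) with one HindIII site
# placed so that the cut falls 444 bp from the 5' end.
mock_single_site = "C" * 443 + "AAGCTT" + "C" * 591

print(len(mock_single_site), "bp amplicon ->", digest(mock_single_site))
print("no site ->", digest("C" * 1040))
```

With real sequence data, the same function could be applied to each isolate's amplicon to predict which banding pattern it should produce before running the gel.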


Appendices for Chapter 2

Appendix 2.1 Alignment of RPB2 sequences from Sordariomycete species. Primer-binding sites are indicated by shading.

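Debaryomyces_hansenii         ATTTTACAAAAAGAATTATTACCTAATATTACTCAAGAAGAAGGGTTTGA 50
Candida_tropicalis            --------------------------------------------------
Protocrea_pallida             --------------------------------------------------
Protocrea_farinosa            --------------------------------------------------
Cladobotryum_cubitense        --------------------------------------------------
Hypocrea_voglmayrii           --------------------------------------------------
Trichoderma_ovalisporum       --------------------------------------------------
Hypocrea_lixii                --------------------------------------------------
Arachnocrea_stipata           --------------------------------------------------
Sphaerostilbella_aureonitens  --------------------------------------------------
Sporophagomyces_chrysostomus  --------------------------------------------------
Metarhizium_anisopliae        --------------------------------------------------
Stachybotrys_echinata         --------------------------------------------------
Peethambara_spirostriata      --------------------------------------------------
Fusarium_virguliforme         --------------------------------------------------
Penicillium_chrysogenum       -TTATGCAGAAGGAATTGCTGCCTCATATTTCGCAGAATGAGGGAAGTGA 49
Ophioparma_lapponica          --------------------------------------------------
Seynesia_erumpens             --------------------------------------------------
Daldinia_concentrica          --------------------------------------------------
Xylaria_hypoxylon             --------------------------------------------------
Botrytis_ricini               --------------------------------------------------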

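Debaryomyces_hansenii         GACCAGAAAAGCTTTCTTTTTAGGTTACATGGTTAACAGATTATTGTTAT 100
Candida_tropicalis            --------------------------------------------------
Protocrea_pallida             --------------------------------------------------
Protocrea_farinosa            --------------------------------------------------
Cladobotryum_cubitense        --------------------------------------------------
Hypocrea_voglmayrii           --------------------------------------------------
Trichoderma_ovalisporum       --------------------------------------------------
Hypocrea_lixii                --------------------------------------------------
Arachnocrea_stipata           --------------------------------------------------
Sphaerostilbella_aureonitens  --------------------------------------------------
Sporophagomyces_chrysostomus  --------------------------------------------------
Metarhizium_anisopliae        --------------------------------------------------
Stachybotrys_echinata         --------------------------------------------------
Peethambara_spirostriata      --------------------------------------------------
Fusarium_virguliforme         --------------------------------------------------
Penicillium_chrysogenum       AACCCGCAAGGCTTTCTTCCTTGGATACATGGTACACCGACTTCTTCAGT 99
Ophioparma_lapponica          --------------------------------------------------
Seynesia_erumpens             --------------------------------------------------
Daldinia_concentrica          --------------------------------------------------
Xylaria_hypoxylon             --------------------------------------------------
Botrytis_ricini               --------------------------------------------------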

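RPB5F:                                               GAYGAYMGWGATCAYTTYGG
Debaryomyces_hansenii         GTGCCTTAGAAAGAAAAGAACCAGATGATAGAGATCATTTTGGTAAAAAG 150
Candida_tropicalis            ----------------AGAACCAGATGATAGAGATCATTTTGGTAAAAAG 34
Protocrea_pallida             --------------------------------------------------
Protocrea_farinosa            --------------------------------------------------
Cladobotryum_cubitense        --------------------------------------------------
Hypocrea_voglmayrii           --------------------------------------------------
Trichoderma_ovalisporum       --------------------------------------------------
Hypocrea_lixii                --------------------------------------------------
Arachnocrea_stipata           --------------------------------------------------
Sphaerostilbella_aureonitens  --------------------------------------------------
Sporophagomyces_chrysostomus  --------------------------------------------------
Metarhizium_anisopliae        --------------------------------------------------
Stachybotrys_echinata         --------------------------------------------------
Peethambara_spirostriata      --------------------------------------------------
Fusarium_virguliforme         --------------------------------------------------
Penicillium_chrysogenum       GTGCTCTGGGTCGTCGTGATGTCGATGACCGTGACCACTTCGGAAAGAAG 149
Ophioparma_lapponica          --------------------------------------------AAGAAA 6
Seynesia_erumpens             --------------------------------------------------
Daldinia_concentrica          --------------------------------------------------
Xylaria_hypoxylon             --------------------------------------------------
Botrytis_ricini               --------------------------------------------------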

Debaryomyces_hansenii         AGATTAGATTTAGCAGGTCCATTATTAGCTAACTTGTTCCGTATTTTATT 200
Candida_tropicalis            AGATTGGATTTAGCTGGTCCATTGTTGGCTGGTTTGTTCCGTATTTTATT 84
Protocrea_pallida             -----------------CCTCTGTTGGCGAAGCTGTTCCGTGGTATCATG 33
Protocrea_farinosa            -------------------------------GCTGTTCCGTGGTATCATG 19
Cladobotryum_cubitense        -------------------------------GCTCTTCCGTGGTATCATG 19
Hypocrea_voglmayrii           --------------------------------------------------
Trichoderma_ovalisporum       -----------GCGGGTCCATTGCTGGCCAAACTGTTCCGTGGTATCATG 39
Hypocrea_lixii                -------------------CCTGCTGGCCAAGCTGTTCCGTGGTATCATG 31
Arachnocrea_stipata           -------------------------------GCTGTTCCACGGCATCATG 19
Sphaerostilbella_aureonitens  -----------GCAGGTCCTCTTCTGGCGAAGCTCTTCCGTGGTATTATG 39
Sporophagomyces_chrysostomus  -------------------------------GCTCTTCCGTGGCATCATG 19
Metarhizium_anisopliae        --------------------CTTTTGGCTAAGCTCTTCCGTGGCATCATT 30
Stachybotrys_echinata         --------------------------------------------------
Peethambara_spirostriata      ---------------------------------TATTCCGTGGCATTGTC 17
Fusarium_virguliforme         --------------------------------------------------
Penicillium_chrysogenum       CGACTCGATTTGGCTGGTCCGCTTTTGGCAAGTCTTTTCCGAACTCTTTT 199
Ophioparma_lapponica          CGCCTCGATCTTGCGGGGCCACTCCTTGCTGGCTTGTTCAGAATGCTGTT 56
Seynesia_erumpens             --------------------------------------------------
Daldinia_concentrica          --------------------------------CTATTCCGCAACATAGTT 18
Xylaria_hypoxylon             --------------------------------CTCTTCCGTAACATAGTT 18
Botrytis_ricini               ---------------------CCCATAGCTTGCTTACCCATAGCA-GATT 28

Debaryomyces_hansenii         CAAAAAATTAACGAAAGATATCTATAAT--TATATGCAAAGATGTGTTGA 248
Candida_tropicalis            CAAAAAATTAACAAAAGATATTTACAAC--TATATGCAAAGATGTGTTGA 132
Protocrea_pallida             CGAA-GAGTGAACACTGAACTTGCCAAC--TATTTGAGGAGATGTGTCGA 80
Protocrea_farinosa            CGAA-GAGTGAACACTGAACTTGCCAAC--TATTTGAGGAGATGTGTCGA 66
Cladobotryum_cubitense        CGAA-GAATGAACACGGAACTCGCCAAC--TATCTCAGGCGATGTGTCGA 66
Hypocrea_voglmayrii           ---A-GAATGAATACGGAGCTGGCCAAC--TACCTGAGACGATGTGTCGA 44
Trichoderma_ovalisporum       CGCA-GAATGAATACTGAGCTGGCAAAC--TACCTGAGACGATGTGTTGA 86
Hypocrea_lixii                CGAA-GGATGAACACTGAGTTGGCCAAC--TACCTGAGACGATGCGTTGA 78
Arachnocrea_stipata           AGAA-GGATAAACACGGAATTAGCTAAC--TATCTCAGACGATGTGTGGA 66
Sphaerostilbella_aureonitens  CGAA-GAATGAACACCGAGCTTGCCAAC--TACCTTCGAAGGTGTGTTGT 86
Sporophagomyces_chrysostomus  CGAA-GGATAAATACCGAGCTTGCCAAC--TACCTGAGGCGATGCGTTGA 66
Metarhizium_anisopliae        CGCA-GGATGAATACGGAATTGTCAAAT--TATCTGCGAAGATGCGTTGA 77
Stachybotrys_echinata         ---------------------------------------------ATCGA 5
Peethambara_spirostriata      CGCC-GACTGAATCTGGAGTTCTCCAAC--TACATGAGGCGGTGTATTGA 64
Fusarium_virguliforme         --------------------------------------------------
Penicillium_chrysogenum       CACCCGAGTCACGAAGGATCTCCAACGT--TACGTCCAGCGATGCGTTGA 247
Ophioparma_lapponica          CAACAAGCTTACCAAGGACGTCTACAAG--TATCTCCAGAAGTGTGTGGA 104
Seynesia_erumpens             -----GAATGACCCAGGAGGTTTTGTCG--CAAATGAAGCGCTGTATAGA 43
Daldinia_concentrica          CGTC-GGTTGGTCCAAGAGATCACGGCC--CATCTCCGGCGCTGCATTGA 65
Xylaria_hypoxylon             CGTC-GGATGACCCAGGAGGTTCTGTCG--CACCTCAAGCGAAGTATTGA 65
Botrytis_ricini               GATAACAGTTACGAGGAGACTTTGTAGCAATAACATTAGCCATGAATGAA 78

Debaryomyces_hansenii         AAATGATAAAGAATTTAACTTGACATTGGCTGTCAAATCACAAACCATAA 298
Candida_tropicalis            AAATGACAGTGACTTCAATCTTACGTTGGCTGTTAAATCACAAACTATTA 182
Protocrea_pallida             GGGCAATCGTCATTTCAACCTGGCTGTGGGTATCAAGCCCGGAACTCTGT 130
Protocrea_farinosa            GGGCAATCGTCATTTCAACCTGGCTGTGGGTATCAAGCCCGGAACTCTGT 116
Cladobotryum_cubitense        AGGAAACAGGCATTTCAACCTTGCTGTCGGTATCAAGCCCGGCACTTTGT 116
Hypocrea_voglmayrii           GGGTAACCGACATTTCAATCTTGCTGTTGGTATCAAACCCGGCACGCTTT 94
Trichoderma_ovalisporum       GGGTAACCGCCACTTCAATCTTGCTGTTGGCATCAAGCCCGGCACGCTTT 136
Hypocrea_lixii                GGGCAACCGACATTTCAACCTTGCTGTTGGTATCAAGCCCGGCACGCTTT 128
Arachnocrea_stipata           AGGCAACCGACATTTCAACCTTGCCGTCGGCATTAAGCCTGGTACTCTAT 116
Sphaerostilbella_aureonitens  GGGCAACCGTCATTTCAACCTTGCGGTCGGTATCAAGCCTGGTACTCTTT 136
Sporophagomyces_chrysostomus  AGGCAATCGACATTTCAATCTCGCCGTTGGCATTAAGCCTGGCACCCTAT 116
Metarhizium_anisopliae        GGGTAACCGGCATTTCAACTTGGCAGTTGGTATCAAGCCTGGAACACTTT 127
Stachybotrys_echinata         ----------CCTTTCAGCCTGGCTGGCGGTATCAAGAAGGGAACATTGA 45
Peethambara_spirostriata      AGCGAACAGGCCGTTCAGCTTGCGAGGTGGCATTAAGACCGGGACTTTGA 114
Fusarium_virguliforme         --------------------------------------------------
Penicillium_chrysogenum       GACCAATCGAGAAATTTATCTCAACATTGGTATCAAGGCTAGCACATTGA 297
Ophioparma_lapponica          GAACAATCGCGAGTTTAATCTTACCCTAGGGGTCAAGTCCACTACGCTCA 154
Seynesia_erumpens             GGCAAACAAAGCCTTCCAGATTGAATTAGCTGTGAAGCCCGCCATAATAA 93
Daldinia_concentrica          CCAAAACAGGCGCTTCCAAATCGAGCTCGCCGCTAAACCGGCCATCGTAA 115
Xylaria_hypoxylon             ACAGGGCAAGCAGTTTAATATTGCGCTAGCCGTCAAGTCAAATATTATCA 115
Botrytis_ricini               TA---------AATTCA---TTGGACTTAAAAGCTTACCTG--ATTGTGA 114

RPB150_F:                                                       CTGGGGWGAT---CAR
Debaryomyces_hansenii         CTGATGGTTTGCGTTATTCTTTAGCAACGGGTAACTGGGGTGAA---CAA 345
Candida_tropicalis            CTGACGGTTTAAGATATTCTTTGGCTACTGGTAACTGGGGTGAA---CAA 229
Protocrea_pallida             CCAATGGTTTGAAATATTCGTTGGCGACGGGTAACTGGGGTGAT---CAG 177
Protocrea_farinosa            CCAATGGTTTGAAATATTCGTTGGCGACGGGTAACTGGGGTGAT---CAG 163
Cladobotryum_cubitense        CCAACGGTTTGAAATATTCTCTTGCGACGGGTAACTGGGGTGAT---CAG 163
Hypocrea_voglmayrii           CTAACGGGTTGAAATATTCGCTCGCCACCGGAAACTGGGGTGAC---CAG 141
Trichoderma_ovalisporum       CTAACGGATTGAAGTATTCACTCGCTACTGGAAACTGGGGTGAC---CAG 183
Hypocrea_lixii                CAAACGGATTGAAGTATTCGCTTGCCACAGGCAACTGGGGTGAT---CAG 175
Arachnocrea_stipata           CAAATGGTCTTAA-TATTCGCTCGCCACTGGTAACTGGGGTGAT---CAG 162
Sphaerostilbella_aureonitens  CCAACGGCCTGAAGTACTCGCTTGCCACCGGCAACTGGGGTGAT---CAG 183
Sporophagomyces_chrysostomus  CCAATGGTTTGAAGTACTCGCTGGCCACCGGCAACTGGGGTGAT---CAA 163
Metarhizium_anisopliae        CCAACGGGTTGAAATATTCCCTTGCCACTGGCAACTGGGGAGAT---CAG 174
Stachybotrys_echinata         CCAACGGTCTAAAGTACTCGCTGGCCACGGGCAACTGGGGAGAT---CAG 92
Peethambara_spirostriata      CGAACGGTCTTAAGTACTCGCTTGCGACTGGTAACTGGGGCGAT---CAG 161
Fusarium_virguliforme         --------------------------------------------------
Penicillium_chrysogenum       CCGGTGGATTGAAGTATGCTCTTGCTACTGGTAACTGGGGCGAG---CAG 344
Ophioparma_lapponica          CCAACGGGTTGAAGTACTCTCTAGCCACTGGGAATTGGGGGGAC---CAG 201
Seynesia_erumpens             CTAACGGGCTCAAGTATTCTCTGGCTACGGGCAATTGGGGTGAT---CAG 140
Daldinia_concentrica          CCAACGGCCTGAAATACTCGCTCGCCACAGGAAACTGGGGTGAT---CAG 162
Xylaria_hypoxylon             CGAGTGGACTGAAGTACTCTCTCGCTACAGGAAATTGGGGTGAT---CAG 162
Botrytis_ricini               TCTG-GGAAGGGAATAAT-GCTTGC-GCAGATACCCAAGATCATACTTGG 161

                              AAGAAG---G
Debaryomyces_hansenii         AAGAAG---GCAATGAGTTCCCGTGCTGGTGTTTCTCAAGTTTTGAATCG 392
Candida_tropicalis            AGAAAA---GCCATGAGTTCAAGAGCTGGTGTCTCCCAAGTTTTGAATCG 276
Protocrea_pallida             AAAAAG---GCCATGAGCTCCACAGCTGGTGTGTCTCAAGTGCTTAATCG 224
Protocrea_farinosa            AAAAAG---GCCATGAGCTCCACAGCTGGTGTGTCTCAAGTGCTTAATCG 210
Cladobotryum_cubitense        AAGAAG---GCCGCGAGCGCTACTGCCGGCGTGTCTCAGGTGCTTAACCG 210
Hypocrea_voglmayrii           AAGAAG---GCAATGAGCTCAACCGCAGGCGTATCACAGGTGCTTAACCG 188
Trichoderma_ovalisporum       AAGAAG---GCAATGAGCTCGACCGCAGGTGTATCACAGGTGCTTAACCG 230
Hypocrea_lixii                AAGAAG---GCCATGAGCTCAACTGCAGGTGTGTCCCAGGTGCTTAACCG 222
Arachnocrea_stipata           AAGAAG---GCCATGAGTTCCACCGCTGGTGTGTCTCAGGTGCTTAACCG 209
Sphaerostilbella_aureonitens  AAGAAG---GCCGCCAGTTCCACTGCCGGCGTGTCTCAGGTGTTGAACCG 230
Sporophagomyces_chrysostomus  AAGAAG---GCAATGAGCTCAACGGCTGGTGTATCTCAGGTGTTGAATCG 210
Metarhizium_anisopliae        AAGAAG---GCCATGAGTTCCACTGCCGGCGTGTCCCAAGTGTTGAATAG 221
Stachybotrys_echinata         AAGAAG---GCCGCGAGCTCAACAGCCGGTGTCTCGCAAGTGCTCAACCG 139
Peethambara_spirostriata      AAAAAG---GCCATGAGCTCGACTGCGGGTGTCTCCCAGGTGCTCAACCG 208
Fusarium_virguliforme         --------------------------------------------------
Penicillium_chrysogenum       AAGAAG---GCGGCTTCCGCCAAGGCTGGTGTGTCCCAGGTGCTGAGTCG 391
Ophioparma_lapponica          AAGAAG---GCGGCATCTTCCAAGGCAGGCGTCTCCCAGGTGCTCAACCG 248
Seynesia_erumpens             AAGAAG---GCCATGAGTTCGACAGCAGGCGTGTCTCAGGTCTTGAATAG 187
Daldinia_concentrica          AAGAAG---GCGATGAGTTCTACTGCGGGTGTGTCGCAGGTTTTGAACAG 209
Xylaria_hypoxylon             AAAAAG---GCCATGAGCTCTACTGCTGGTGTTTCGCAGGTGCTCAATCG 209
Botrytis_ricini               ATGAATTTCACAATGGGTCCAGACATGGGCA---GTCAGATTGATAGGCG 208

Debaryomyces_hansenii         TT---ATACATATTCT----TCTACGTTGTCACATTTAAGAAGAA-CTAA 434
Candida_tropicalis            TT---ATACTTACTCT----TCTACATTGTCACATTTAAGAAGAA-CCAA 318
Protocrea_pallida             AT---ATACATTTGCG----TCCACATTATCACATCTTCGACGAA-CAAA 266
Protocrea_farinosa            AT---ATACATTTGCG----TCCACATTATCACATCTTCGACGAA-CAAA 252
Cladobotryum_cubitense        TT---ATACATTCGCC----TCCACCTTGTCCCATCTTCGACGTA-CCAA 252
Hypocrea_voglmayrii           AT---ACACATTTGCT----TCGACACTCTCCCATTTGCGTCGTA-CTAA 230
Trichoderma_ovalisporum       TT---ACACTTTTGCT----TCTACACTATCTCATTTGCGTCGTA-CCAA 272
Hypocrea_lixii                TT---ACACGTTTGCT----TCGACCTTGTCGCATTTGCGTCGTA-CCAA 264
Arachnocrea_stipata           TT---ACACATTTGCT----TCTACTTTGTCGCACTTGCGGCGTA-CCAA 251
Sphaerostilbella_aureonitens  TT---ACACATTTGCA----TCAACACTCTCGCATTTGCGACGAA-CCAA 272
Sporophagomyces_chrysostomus  CT---ATACCTTTGCC----TCGACGCTTTCGCATTTGCGACGAA-CAAA 252
Metarhizium_anisopliae        GT---ATACTTTTGCT----TCGACACTCTCTCACTTGCGACGAA-CCAA 263
Stachybotrys_echinata         CT---ACACATTTGCC----TCCACTCTTTCCCATCTGAGGCGTA-CCAA 181
Peethambara_spirostriata      GT---ATACGTTTGCG----TCAACCCTGTCCCATCTTCGAAGGA-CCAA 250
Fusarium_virguliforme         -T---ACACCTTTGCT----TCCACCCTTTCACATTTGCGACGAA-CCAA 41
Penicillium_chrysogenum       TT---ACACATTTGCC----TCCTCCTTGTCTCATCTGCGCCGGA-CAAA 433
Ophioparma_lapponica          CT---ATACTTTTGCT----TCTACGTTGTCGCATCTTCGACGAA-CGAA 290
Seynesia_erumpens             GT---ATACTTTTGCC----TCGACTCTTTCCCATTTGAGAAGAA-CCAA 229
Daldinia_concentrica          AT---ACACTTTTGCC----TCGACTCTTTCTCATTTAAGGCGAA-CGAA 251
Xylaria_hypoxylon             AT---ACACGTTTGCA----TCTACCTTGTCACATTTGCGAAGAA-CGAA 251
Botrytis_ricini               CCTTGACACGCTTGTTCAAATCACCACTTTCG---TCAGGACGAATCTGA 255
                              * ** ** * * ** * * * * *

Debaryomyces_hansenii         CACTCCTATTGGTCGTGATGGTAAGATTGCAAAACCTAGACAATTGCATA 484
Candida_tropicalis            TACTCCAATTGGTAGGGATGGTAAGATTGCCAAACCTAGACAATTGCATA 368
Protocrea_pallida             CACTCCGATTGGAAGAGACGGTAAAATAGCTAAACCACGACAGCTGCATA 316
Protocrea_farinosa            CACTCCTATTGGAAGAGACGGTAAAATAGCTAAGCCACGACAGCTGCATA 302
Cladobotryum_cubitense        CACCCCCATTGGAAGAGACGGCAAGATTGCCAAGCCCCGCCAGTTGCACA 302
Hypocrea_voglmayrii           TACACCCATCGGAAGAGATGGTAAGCTGGCGAAGCCTCGACAGCTTCACA 280
Trichoderma_ovalisporum       TACACCCATCGGAAGAGATGGTAAGCTGGCGAAGCCTCGACAGCTCCACA 322
Hypocrea_lixii                TACTCCTATCGGAAGAGATGGTAAGCTGGCAAAGCCTCGACAGCTTCACA 314
Arachnocrea_stipata           CACTCCCATCGGAAGAGATGGTAAGCTGGCAAAGCCGCGACAGCTTCACA 301
Sphaerostilbella_aureonitens  CACTCCTATCGGAAGAGATGGCAAGCTCGCTAAGCCTAGACAACTTCACA 322
Sporophagomyces_chrysostomus  TACTCCCATCGGACGAGATGGCAAACTGGCCAAGCCTCGACAACTGCACA 302
Metarhizium_anisopliae        CACACCTATTGGTAGAGATGGTAAGCTCGCTAAACCTCGTCAGCTGCACA 313
Stachybotrys_echinata         TACGCCCATTGGAAGAGATGGCAAGCTGGCCAAACCTCGCCAGCTTCACA 231
Peethambara_spirostriata      CACGCCTATTGGCCGAGACGGAAAGCTAGCCAAGCCTCGACAGCTTCACA 300
Fusarium_virguliforme         CACCCCTATCGGACGAGATGGAAAGCTCGCCAAGCCCCGTCAGCTACACA 91
Penicillium_chrysogenum       CACCCCCATTGGCAGAGATGGAAAGATCGCCAAACCTCGCCAGCTCCACA 483
Ophioparma_lapponica          TACACCTATCGGCCGTGATGGTAAAATAGCAAAACCACGGCAGCTGCACA 340
Seynesia_erumpens             TACTCCGGTTGGAAGAGACGGTAAACTCGCAAAGCCGCGTCAACTGCACA 279
Daldinia_concentrica          TACGCCTATCGGAAGAGACGGAAAGCTTGCGAAACCTCGACAGCTGCACA 301
Xylaria_hypoxylon             CACCCCAGTAGGCAGAGATGGTAAACTCGCCAAACCACGACAGCTTCACA 301
Botrytis_ricini               TA-ACCAGCCTGAAGCTGTCGAGAGATGTCCAAATCTTCAGGAGT-CATG 303
                              * ** * * * * * * ** * * **

Debaryomyces_hansenii         ACACTCATTGGGGTCTTGTCTGTCCTGCAGAAACTCCAGAAGGTCAAGCG 534
Candida_tropicalis            ATACCCATTGGGGTTTGGTGTGTCCTGCAGAAACTCCCGAAGGTCAAGCG 418
Protocrea_pallida             ATACACATTGGGGCCTGGTCTGTCCTGCCGAAACCCCAGAAGGACAGGCT 366
Protocrea_farinosa            ATACACATTGGGGCTTGGTCTGTCCTGCCGAAACCCCAGAAGGACAGGCT 352
Cladobotryum_cubitense        ACACGCATTGGGGCTTGGTGTGCCCTGCCGAAACTCCTGAGGGACAAGCT 352
Hypocrea_voglmayrii           ACACGCACTGGGGTTTGGTATGCCCGGCTGAGACGCCCGAAGGTCAGGCT 330
Trichoderma_ovalisporum       ACACACACTGGGGCTTGGTGTGCCCGGCTGAGACACCTGAAGGACAGGCT 372
Hypocrea_lixii                ACACGCATTGGGGTTTGGTCTGCCCAGCCGAGACACCCGAAGGACAGGCT 364
Arachnocrea_stipata           ATACCCACTGGGGTCTTGTCTGCCCAGCTGAAACTCCCGAAGGACAGGCT 351
Sphaerostilbella_aureonitens  ACACACATTGGGGTCTGGTTTGCCCAGCCGAGACCCCTGAAGGACAGGCT 372
Sporophagomyces_chrysostomus  ACACCCACTGGGGTCTGGTTTGTCCTGCCGAGACCCCCGAAGGGCAAGCT 352
Metarhizium_anisopliae        ACACGCACTGGGGCTTGGTCTGTCCCGCCGAGACGCCAGAAGGTCAGGCT 363
Stachybotrys_echinata         ATACCCACTGGGGGCTGGTTTGTCCCGCTGAAACCCCTGAGGGACAAGCA 281
Peethambara_spirostriata      ATACGCACTGGGGTCTCGTTTGCCCGGCTGAGACACCCGAAGGCCAGGCT 350
Fusarium_virguliforme         ACACCCATTGGGGTCTGGTGTGCCCAGCCGAGACGCCCGAGGGTCAGGCT 141
Penicillium_chrysogenum       ATACTCACTGGGGTTTGGTCTGCCCGGCCGAGACACCTGAAGGTCAGGCT 533
Ophioparma_lapponica          ACACTCATTGGGGTCTTGTGTGCCCCGCCGAAACCCCAGAAGGACAAGCT 390
Seynesia_erumpens             ACACGCACTGGGGTTTGGTGTGCCCGGCAGAAACTCCGGAAGGCCAGGCC 329
Daldinia_concentrica          ATACCCATTGGGGCCTGGTCTGCCCGGCAGAAACGCCCGAAGGCCAGGCT 351
Xylaria_hypoxylon             ACAGCCACTGGGGCCTCGTCTGCCCGGCCGAGACCCCCGAAGGTCAAGCT 351
Botrytis_ricini               GTAATCATGACAGTCTCTTCTTCCTCAGCG----TCCAGGTATTCAACCA 349
                              * ** * * * * * * ** * ** *

Debaryomyces_hansenii         TGTGGTT--TGGTCAA------GAATTTGT--CTTTGATGTCGTGTATAT 574
Candida_tropicalis            TGTGGTT--TGGTTAA------GAATTTAT--CATTGATGACTTGTATTT 458
Protocrea_pallida             TGTGGTT--TGGTCAA------GAACTTGT--CTTTGATGTGTTATGTTA 406
Protocrea_farinosa            TGTGGTT--TGGTCAA------GAACTTGT--CTTTGATGTGTTATGTCA 392
Cladobotryum_cubitense        TGTGGTT--TAGTCAA------GAATTTGT--CTTTGATGTGTTACGTCA 392
Hypocrea_voglmayrii           TGTGGCC--TGGTCAA------AAACTTGT--CTCTGATGTGCTACGTCA 370
Trichoderma_ovalisporum       TGTGGTC--TGGTCAA------GAATTTGT--CTCTGATGTGCTACGTCA 412
Hypocrea_lixii                TGTGGTC--TGGTCAA------GAACTTGT--CTTTGATGTGTTACGTCA 404
Arachnocrea_stipata           TGTGGTT--TGGTCAA------GAACTTGT--CATTGATGTGTTACGTCA 391
Sphaerostilbella_aureonitens  TGTGGTT--TGGTCAA------GAATTTGT--CCTTGATGTGTTATGTTA 412
Sporophagomyces_chrysostomus  TGTGGCC--TGGTCAA------GAATTTGT--CCCTCATGTGCTATGTCA 392
Metarhizium_anisopliae        TGCGGTC--TGGTCAA------GAACCTGT--CATTGATGTGTTATGTCA 403
Stachybotrys_echinata         TGTGGTC--TCGTCAA------AAACTTAT--CGTTGATGTGCTACGTCA 321
Peethambara_spirostriata      TGTGGTC--TGGTAAA------GAACCTCT--CCCTGATGTGTTATGTCA 390
Fusarium_virguliforme         TGTGGTC--TGGTCAA------GAACTTGT--CTCTGATGTGCTATGTTA 181
Penicillium_chrysogenum       TGTGGTC--TGGTCAA------GAACCTTG--CATTGATGTGCTACATCA 573
Ophioparma_lapponica          TGTGGTT--TGGTGAA------AAATCTGG--CTCTGATGTGTTACATTA 430
Seynesia_erumpens             TGCGGTC--TCGTAAA------AAATCTCT--CTCTTATGTGCTCTATCA 369
Daldinia_concentrica          TGCGGTT--TGGTGAA------GAACCTGT--CGCTTATGTGCTCGATAA 391
Xylaria_hypoxylon             TGTGGCC--TAGTCAA------AAATCTGT--CTCTCATGTGCTCAATCA 391
Botrytis_ricini               CGCCATCGTTGATCAAACCTTGGAATCCATAGTATCCGTGTCTTACTT-- 397
                              * * * ** ** ** * *

Debaryomyces_hansenii         CTGTTGGTACTTCATCAGAACCAATTTTGT-ATTTCTTAGAAGAATGGGG 623
Candida_tropicalis            CTGTTGGGTCTTCATCCGAACCAATTTTGG-GTTTCTTGAGAGATTTCGG 507
Protocrea_pallida             GTGTTGGCTCTCCTTCTGAACCTTTGATTG-AATTCATGATCAACAGAGG 455
Protocrea_farinosa            GTGTGGGCTCTCCTTCTGAACCTTTGATTG-AATTCATGATCAACAGAGG 441
Cladobotryum_cubitense        GTGTCGGATCCCCTTCAGAACCCCTGATTG-AGTTTATGATCAACCGAGG 441
Hypocrea_voglmayrii           GTGTCGGATCTCCATCTGAGCCTCTGATCG-AATTCATGATCAACAGAGG 419
Trichoderma_ovalisporum       GTGTTGGATCTCCTTCTGAGCCTCTGATCG-AGTTTATGATCAACAGAGG 461
Hypocrea_lixii                GTGTCGGTTCTCCCTCTGAACCTCTCATTG-AGTTCATGATCAACAGAGG 453
Arachnocrea_stipata           GTGTTGGTTCGCCATCCGAGCCATTGATTG-AGTTCATGATCAACCGAGG 440
Sphaerostilbella_aureonitens  GTGTCGGTTCTCCTTCCGAGCCTTTGATCG-AGTTTATGATCAACAGAGG 461
Sporophagomyces_chrysostomus  GTGTTGGTTCTCCCTCAGAACCCCTGATTG-AGTTCATGATCAACCGAGG 441
Metarhizium_anisopliae        GTGTGGGTTCACCGGCCGAGCCATTGATTG-AATTCATGATCAACCGTGG 452
Stachybotrys_echinata         GTGTTGGTTCTCCCGCAGAGCCATTGATCG-AATTCATGATCAACCGTGG 370
Peethambara_spirostriata      GTGTCGGCTCGCGGGTCGACCCCCTTATCG-AATTCATGATCCAGCGAGG 439
Fusarium_virguliforme         GTGTCGGTTCTCCCTCTGAACCTCTGATCG-AGTTCATGATCAACCGAGG 230
Penicillium_chrysogenum       CTGTTGGTACACCTGCTGAACCTATCGTGG-ATTTCATGATTCAGCGGAA 622
Ophioparma_lapponica          CTGTTGAAACTCCAAGCGAACCTATAATTG-ACTTCATGATCCAGAGAAA 479
Seynesia_erumpens             GCGTGGGCACATCGACGGAACCAATAATCG-ATTATATGATTACCCGGAA 418
Daldinia_concentrica          GTGTGGGCACGTCAACGGATCCCATCGTCG-ACTATATGATCACGAGAAA 440
Xylaria_hypoxylon             GTGTTGGCACGTCTACGGAACCTATTATCG-ATTACATGATTTCACGTAA 440
Botrytis_ricini               -TGTC-ATCCTTATCCATATTTGCTGGTAGCAATTGATCATCTTCCAGAC 445
                              ** * * * * * *

Debaryomyces_hansenii         TATGGA---ACCTTTGGAAGATTATGTACCTTCGAATTCTCCCGATTCAA 670
Candida_tropicalis            TTTAGA---AGTCTTGGAAGATTATGTTCCATCCAATGCTCCAGATTCCA 554
Protocrea_pallida             TATGGA---AGTGGTCGAGGAGTATGAGCCTTTGAGATATCCACATGCAA 502
Protocrea_farinosa            TATGGA---AGTGGTCGAGGAGTATGAGCCTTTGAGATATCCACATGCTA 488
Cladobotryum_cubitense        TATGGA---AGTCGTTGAGGAGTATGAACCCCTGAGATACCCTCACGCAA 488
Hypocrea_voglmayrii           CATGGA---GGTCGTCGAAGAGTACGAACCACTGAGGTATCCTCATGCGA 466
Trichoderma_ovalisporum       CATGGA---AGTTGTTGAGGAGTACGAGCCACTGAGATATCCCCATGCTA 508
Hypocrea_lixii                TATGGA---AGTCGTCGAAGAGTACGAGCCTCTGCGGTATCCTCATGCTA 500
Arachnocrea_stipata           CATGGA---AGTTGTTGAAGAATATGAACCTTTACGCTACCCCCATGCCA 487
Sphaerostilbella_aureonitens  TATGGA---AGTAGTGGAGGAGTATGAGCCGCTGCGATATCCCCATGCCA 508
Sporophagomyces_chrysostomus  CATGGA---AGTTGTTGAAGAGTATGAGCCTTTGCGTTACCCACACGCCA 488
Metarhizium_anisopliae        CATGGA---AGTGGTAGAAGAGTACGAGCCGCTGAGATATCCTCATGCCA 499
Stachybotrys_echinata         AATGGA---GGTTGTTGAAGAGTACGAACCTCTCCGGTACCCGCACGCGA 417
Peethambara_spirostriata      CATGGA---GGTGATTGAGGAGTATGAGCCACTGCGTTATCCGCACGCCA 486
Fusarium_virguliforme         TATGGA---GGTCGTGGAAGAGTACGAACCCCTGAGATACCCGCATGCCA 277
Penicillium_chrysogenum       CATGGA---AGTTCTCGAGGAGTTTGAACCCCAAGTGACGCCTAATGCAA 669
Ophioparma_lapponica          TATGGA---GGTCTTAGAAGAGTACGAACCTCAACGCTCGCCAAATGCCA 526
Seynesia_erumpens             CATGGA---AGTTCTTGAAGAGTACGAGCCATTGAGATATCCTAACGCGA 465
Daldinia_concentrica          TATGGA---AGTGTTGGAGGAATACGAACCCATGCGCTATCCTAACGCCA 487
Xylaria_hypoxylon             TATGGA---GGTTCTAGAGGAGTACGATCATCACAGGTATCCTAATGCCA 487
Botrytis_ricini               GGCGAATATGATCCTTGTTCAATACCAAGTT---ACCTTTGTTAGCGCTG 492
                              * * * * * *

Debaryomyces_hansenii         CGAGAGTCTTTGTTAATGGT--GTCTGGGTTGG--TACTCATAGAGAACC 716
Candida_tropicalis            CTAGAATTTTCGTCAATGGT--GTTTGGGTTGG--TGTTCACAGAGATCC 600
Protocrea_pallida             CAAAGATTTTCGTCAACGGA--GTTTGGGTTGG--TGTGCACCAAGATCC 548
Protocrea_farinosa            CGAAAATTTTCGTCAACGGA--GTTTGGGTTGG--TGTGCACCAAGATCC 534
Cladobotryum_cubitense        CTAAGATTTTTGTCAACGGT--GTTTGGGTCGG--CGTGCATCAGGACCC 534
Hypocrea_voglmayrii           CAAAGATCTTTGTAAACGGT--GTTTGGGTCGG--AATCCACCAAGACCC 512
Trichoderma_ovalisporum       CAAAGATCTTTGTGAATGGT--GTCTGGGTTGG--AGTTCACCAAGATCC 554
Hypocrea_lixii                CAAAGATTTTTGTGAACGGT--GTCTGGGTTGG--AGTCCACCAAGACCC 546
Arachnocrea_stipata           CTAAGATCTTTGTCAACGGA--GTCTGGGTCGG--AGTACATCAAGATCC 533
Sphaerostilbella_aureonitens  CCAAGATCTTTGTAAACGGT--GTCTGGGTTGG--TGTGCACCAAGACCC 554
Sporophagomyces_chrysostomus  CCAAGATTTTCGTCAATGGT--GTATGGGTTGG--CGTGCATCAGGATCC 534
Metarhizium_anisopliae        CCAAGATCTTTGTCAATGGT--GTTTGGGTTGG--TGTACACCAAGATCC 545
Stachybotrys_echinata         CCAAAATTTTTGTCAATGGA--GTTTGGGTTGG--CGTTCACCAGGACCC 463
Peethambara_spirostriata      CTAAAATTTTCGTCAACGGC--GCCTGGATCGG--TGTCCATCAGGACCC 532
Fusarium_virguliforme         CCAAGATCTTTGTCAACGGT--GTCTGGTGCGG--TGTCCACTCGGACCC 323
Penicillium_chrysogenum       CAAAGGTGTTTGTCAATGGT--GTCTGGGTGGG--TATTCACCGGGATCC 715
Ophioparma_lapponica          CCAAGGTCTTTGTAAACGGT--GTATGGGTGGG--AGTCCATCGAGATCC 572
Seynesia_erumpens             CCAAGATCTTCCTGAACGGT--TCCTGGATAGG--AGTGCACCAAGATCC 511
Daldinia_concentrica          CCAAGATCTTCCTCAACGGA--TCTTGGATTGG--TGTGCACCAGGATCC 533
Xylaria_hypoxylon             CTAAGATTTTCCTTAATGGT--GCGTGGATCGG--CGTCCATCAGGATCC 533
Botrytis_ricini               TCAGGATCATTGTCGATGACCAATAAAGGTCGCACTGTCCATGGTAAATC 542
                              * * * * * * * * ** * *

Debaryomyces_hansenii         TGCTCATTTGGTTGACACTATGCGTAATTTA---AGAAGAAGAGGTGAT- 762
Candida_tropicalis            AGCACAATTGGTTGATTATGTTCGTGATCTT---AGAAGAAGTGGTGAT- 646
Protocrea_pallida             CAAGCACCTGGTCAGCCAAGTCCTTGATACT---CGTCG--TAAATCTT- 592
Protocrea_farinosa            CAAGCACCTGGTCAGCCAAGTCCTTGATACT---CGTCG--TAAATCTT- 578
Cladobotryum_cubitense        CAAGCATTTGGTCAACCAAGTCCTCGATACT---CGTCG--TAAATCCT- 578
Hypocrea_voglmayrii           CAAGCATCTGGTGAACCAAGTTCTGGACACC---CGTCG--CAAGTCCT- 556
Trichoderma_ovalisporum       CAAGCATCTGGTAAACCAAGTTTTGGATACT---CGTCG--CAAATCCT- 598
Hypocrea_lixii                TAAGCACTTGGTGAACCAGGTTCTGGACACT---CGTCG--CAAGTCCT- 590
Arachnocrea_stipata           TAAGCACCTCGTCAGCCAGGTTCTCGACACT---CGTCG--CAAATCAT- 577
Sphaerostilbella_aureonitens  TAAGCATTTGGTGAACCAAGTTCTCGACACT---CGTCG--CAAGTCCT- 598
Sporophagomyces_chrysostomus  CAAGCATTTGGTCAATCAAGTTCTCGACACC---CGCCG--CAAGTCCT- 578
Metarhizium_anisopliae        TAAGCACCTGGTCAGTCAAGTCTTGGATACT---AGACG--AAAGTCGT- 589
Stachybotrys_echinata         CAAACATCTGGTTAGCCAAGTCCTAGATACA---CGTCG--CAAGTCGT- 507
Peethambara_spirostriata      CAAGCATCTGGTCAACGCCGTTATGGATACT---CGCCG--CAGGTCGG- 576
Fusarium_virguliforme         CAAGCATCTCGTCAGTCAAGTCCTGGACACG---CGACG--AAAGTCGT- 367
Penicillium_chrysogenum       TTCGCATCTTGTTACTACGATGCAGAATCTG---CGTCGACGAAACATG- 761
Ophioparma_lapponica          AGCTCATCTTGTCAGCACAGTGCAGAATTTA---CGACGACGGCATCTG- 618
Seynesia_erumpens             CAAGGCTCTCGTCAAAGATGTTCAGCAACTG---CGTCGCACGAATCAA- 557
Daldinia_concentrica          CAAGTCTCTCGTGAGAGATGTGCAGCAGCTT---CGCCGGGCCAACCAG- 579
Xylaria_hypoxylon             TAAGTCTCTTGTGAGGGATGTGCAACAGTTG---CGCCGAACGAATCAG- 579
Botrytis_ricini               GA---ACTTCGTCA--CGATTTCTGAATGGTGAACTTCAGAGAGATCAGT 587
                              * ** * *

Debaryomyces_hansenii         --ATTTCC----CCCG---AAGTTTCTATTATTAGAGATATAAGAGAGAA 803
Candida_tropicalis            --ATTTCT----CCAG---AAGTTTCCATCATTAGAGATATTAGAGAAAA 687
Protocrea_pallida             --ATCTACAG--TATG---AAGTGTCTTTGATCCGAGACATCAGAGATCA 635
Protocrea_farinosa            --ATTTACAG--TATG---AAGTGTCCTTGATCCGAGACATCAGGGATCA 621
Cladobotryum_cubitense        --ATCTGCAA--TACG---AAGTCTCCCTCATCAGAGAAATCAGAGATCA 621
Hypocrea_voglmayrii           --ATCTACAA--TACG---AAGTCTCTCTGATCAGAGACATTCGTGACCA 599
Trichoderma_ovalisporum       --ATCTGCAG--TACG---AAGTCTCTCTGATTAGAGAAATTCGAGACCA 641
Hypocrea_lixii                --ATCTGCAA--TACG---AAGTCTCTCTCGTGAGAGAAATTCGAGACCA 633
Arachnocrea_stipata           --ATTTGCAA--TACG---AAGTCTCTCTTATCAGAGATATTAGAGATCA 620
Sphaerostilbella_aureonitens  --ATCTGCAG--TACG---AAGTTTCTCTTATTAGGGAAATTCGAGATCA 641
Sporophagomyces_chrysostomus  --ATTTACAG--TACG---AAGTCTCACTCATCAGGGAAATTCGAGATCA 621
Metarhizium_anisopliae        --ATTTGCAG--TACG---AGGTGTCTCTCGTCCGAGAAATCAGGGATCA 632
Stachybotrys_echinata         --ATGTCCAG--TATG---AGGTTTCCCTCGTCAGAGATATTCGGGATCA 550
Peethambara_spirostriata      --TCATGCAG--TTTG---AGGTTTCCCTCGTCAGAGATATCAGAGATCA 619
Fusarium_virguliforme         --ATCTGCAG--TACG---AGGTGTCACTCGTTCGTGACATTCGAGACCG 410
Penicillium_chrysogenum       --ATCTCC----CATG---AAGTCAGTTTGATTCGTGACATCCGTGAACG 802
Ophioparma_lapponica          --ATCTCC----CATG---AAGTCAGTCTGGTCCGTGATATCCGAGATCG 659
Seynesia_erumpens             --ATCCCG----GCTG---AAGTTTCTCTCGTTCGCGATATTCGAGATCG 598
Daldinia_concentrica          --ATCCCC----TCCG---AAGTATCTCTGGTTCGCGATATTCGTGATCG 620
Xylaria_hypoxylon             --ATCCCA----GCGG---AGGTATCCTTGATTCGAGATATTCGCGACCG 620
Botrytis_ricini               GGATCTACGAGATCTGTACACATTGACCAAATGAGCAGGATCTGATGATA 637
                              * * * * * ** *

Debaryomyces_hansenii         AGAATTCAAGATTTTCACTGATG--CTGGTCGTGTTTACCGTCC--ATTA 849
Candida_tropicalis            AGAATTTAAAATCTTTACCGATG--CTGGTCGTGTTTACCGTCC--ACTT 733
Protocrea_pallida             AGAGTTCAAGATCTTTTCAGATG--CAGGACGGGTTATGCGCCC--CGTA 681
Protocrea_farinosa            AGAGTTCAAGATCTTCTCTGATG--CAGGACGGGTTATGCGCCC--CGTA 667
Cladobotryum_cubitense        AGAGTTCAAGATTTTCTCCGATG--CGGGACGAGTTATGCGACC--TGTC 667
Hypocrea_voglmayrii           AGAATTCAAGATCTTCTCTGACG--CCGGTCGTGTGATGCGTCC--TGTA 645
Trichoderma_ovalisporum       AGAGTTCAAAATCTTCTCTGATG--CCGGTCGTGTTATGCGTCC--CGTC 687
Hypocrea_lixii                GGAATTCAAAATCTTTTCCGACG--CTGGCCGTGTCATGCGACC--AGTC 679
Arachnocrea_stipata           AGAGTTCAAGATTTTCTCTGACG--CGGGCCGCGTTATGCGGCC--TGTT 666
Sphaerostilbella_aureonitens  GGAATTCAAGATTTTCTCGGACG--CAGGTCGTGTGATGAGACC--AGTT 687
Sporophagomyces_chrysostomus  AGAATTCAAAATCTTTTCTGATG--CAGGCCGTGTTATGAGACC--GGTT 667
Metarhizium_anisopliae        AGAGTTCAAGATTTTCTCCGATG--CTGGCCGAGTTATGAGACC--AGTG 678
Stachybotrys_echinata         GGAGTTCAAGATTTTCTCTGACG--CTGGACGAGTCATGCGTCC--TGTT 596
Peethambara_spirostriata      GGAGTTCAAGATTTTCTCCGACG--CGGGTCGTGTGATGCGACC--AGNA 665
Fusarium_virguliforme         AGAGTTCAAGGTCTTTTCCGACG--CCGGCCGAGTCATGAGACC--AGTC 456
Penicillium_chrysogenum       GGAGTTCAAGATCTTCACCGATA--CTGGACGTGTCTGCCGGCC--ACTC 848
Ophioparma_lapponica          AGAGTTTAAGATTTTCACGGACG--CAGGGAGAGTGTGTCGACC--GCTC 705
Seynesia_erumpens             AGAGTTCAAGATCTTCTCGGACG--CTGGTCGTGTAATGCGGCC--GTTG 644
Daldinia_concentrica          CGAGTTCAAGATTTTCTCGGATG--CCGGTCGTGTCATGCGGCC--CTTA 666
Xylaria_hypoxylon             CGAATTCAAGATCTTCTCAGATG--CCGGTCGCGTCATGCGACC--CTTG 666
Botrytis_ricini               CCAACCCAACACCATTGACGAAAACCTTGTTGCATTAGGTGCTCGGAGTG 687
                              * ** * ** * * * * * *

Debaryomyces_hansenii         TTTATTGTC------GA-----TGATGACGCAGAATCCGAAAC-C---AA 884
Candida_tropicalis            TTCATTGTT------GA-----TGATAATGAAGATTCTCCAAC-T---AA 768
Protocrea_pallida             TTCACTGTGCAGCAGGA-----AGATGACCCTGAGACGGGTAT-TAACAA 725
Protocrea_farinosa            TTTACTGTGCAGCAGGA-----AGATGACCCTGAGACGGGTAT-CAACAA 711
Cladobotryum_cubitense        TTCACCGTGCAGCAAGA-----GGATGATCCTGAAACGGGCAT-CAACAA 711
Hypocrea_voglmayrii           TTCACTGTGCAGCAAGA-----AGATGACCCCGAAACGGGCAT-AAACAA 689
Trichoderma_ovalisporum       TTCACCGTACAGCAGGA-----AGATGACCCGGAAACGGGTAT-CAACAA 731
Hypocrea_lixii                TTTACCGTTCAACAGGA-----AGATGACCCGGAAACGGGCAT-CAACAA 723
Arachnocrea_stipata           TTCACAGTTCAGCAGGA-----AGATGATGCCGAGACGGGCAT-TAACAA 710
Sphaerostilbella_aureonitens  TTTACTGTTCAACAAGA-----GGATGACCCTGAGACGGGAAT-CAACAA 731
Sporophagomyces_chrysostomus  TTCACTGTTCAGCAAGA-----GAATGATCCCGAGACAGGTAT-CGACAA 711
Metarhizium_anisopliae        TTCACTGTGCAGCAAGA-----AGACGATCCCGAGACTGGCAT-TGAAAA 722
Stachybotrys_echinata         TTCACGGTTCAGCAGGA-----AGATGACATTGAGACTGGGGT-TCAGAA 640
Peethambara_spirostriata      TTCACTGTCTTGCAGGA-----GGATAACCCGGAAACAGGACT-CCAGAA 709
Fusarium_virguliforme         TTTACGGTTCAGCAGGA-----GGACGACCATGACTCTGGTAT-TGCCAA 500
Penicillium_chrysogenum       TTCGTTATT------GA-----TAATGATCCCAAGA---GTGA-AAACTC 883
Ophioparma_lapponica          TTCGTCATC------GA-----CAATGACCCCAAAA---GCCT-CAACAA 740
Seynesia_erumpens             TTTGTTGTCGAGCAAGA-----AGATAATCCCGAGACGGGCGC-AGGTAA 688
Daldinia_concentrica          TTCGTGGTGCAGCAAGA-----AGATGATCCCGAGGCTGGAAT-CACGAA 710
Xylaria_hypoxylon             TACGTAGTCGAGCAAGA-----GGATGATCCTGAAAATGGCAT-CGAGAA 710
Botrytis_ricini               GTCATACTCCTCCAACACTTCCATATTTCGTTGAATCATGAATTCGACGA 737
                              * * * *

Debaryomyces_hansenii         --GGGTGAATTAA--AATTAAAGAAAG-AAAATGTACAT--AAATTAATT 927
Candida_tropicalis            --AGGTGACTTGA--TGATTACCAAAG-AACATATTAGA--AAATTAGTT 811
Protocrea_pallida             --GGGCCACCTAG--TCCTAACAAAGG-AGTTGGTGAAT--AAGCTTGCC 768
Protocrea_farinosa            --GGGCCACCTGG--TCCTAACAAAGG-AGTTGGTGAAT--AGGCTTGCC 754
Cladobotryum_cubitense        --GGGTCATCTCG--TTTTGACCAAGG-ACTTGGTGAAC--AGACTTGCA 754
Hypocrea_voglmayrii           --AGGCCACTTAG--TATTGACCAAAG-ATCTCGTCAAT--AGACTGGCA 732
Trichoderma_ovalisporum       --GGGCCACCTGG--TTTTGACCAAGG-AACTCGTCAAT--AGATTGGCT 774
Hypocrea_lixii                --GGGCCACCTGG--TATTGACCAAGG-AGCTCGTCAAT--AGATTGGCC 766
Arachnocrea_stipata           --GGGCCATCTTG--TCTTGACTAAAG-ATTTGGTGAAT--AGGCTGGCA 753
Sphaerostilbella_aureonitens  --GGGCCACTTGG--TGTTGACAAAGG-ACTTGGTGAAT--AAGCTTGCA 774
Sporophagomyces_chrysostomus  --GGGACACTTGG--TCCTAACCAAGG-ATCTGGTCAAC--AGGCTTGCG 754
Metarhizium_anisopliae        --AGGCCATCTGG--TTTTGACCAAAG-ACTTGGTTAAC--AAGCTTGCG 765
Stachybotrys_echinata         --GGGGCAGTTAG--TGTTGACCAAGG-ATCTGGTCAAT--CGCCTGGCT 683
Peethambara_spirostriata      --AGGCCAGCTGG--CGCTGACCAAAG-ACATGGTCAAT--ACGTTGGCA 752
Fusarium_virguliforme         --GGGAGCCTTGG--TTCTGACCAAGG-ACCTCGTCAAC--AAGATCGCC 543
Penicillium_chrysogenum       --GGGCGGATTGG--TCCTTAACAAGG-AACACATTCGG--AAGCTCGAG 926
Ophioparma_lapponica          --CGGCAACCTTG--TTCTCACCAAGG-AGCACGTCAAC--AGACTCGAA 783
Seynesia_erumpens             --AGGTACCCTAG--TCCTGAATAAGG-AAACAGTGCGA--AGACTCGAG 731
Daldinia_concentrica          --GGGCTCGCTAG--CTCTTACCAAGG-AGATGATTCAG--AGATTGGAG 753
Xylaria_hypoxylon             --GGGCACCCTAG--TCTTGACGAAAG-ACATGGTTAGG--AAACTCGAG 753
Botrytis_ricini               TTGGATCACTTGGCGTACCGACTGACGTAACACATCAACCAAGTCTTACC 787
                              * * * * * * * *

Debaryomyces_hansenii         A------ATGCCGAA------TATGATGATTTCGGCGACGAT 957
Candida_tropicalis            GCTGAAGATATGGAT------GATGATGAACTAGAGGAAGAT 847
Protocrea_pallida             A------AGGAG------CAGGCTGAGCCTCCTGAGGAC 795
Protocrea_farinosa            A------AGGAG------CAGGCTGAGCCTCCTGAGGAC 781
Cladobotryum_cubitense        A------AGGAA------CAGGCCGAACCCCCAGAGGAC 781
Hypocrea_voglmayrii           A------AGGAG------CAGGCTGAGCCTCCGGAAGAC 759
Trichoderma_ovalisporum       A------AAGAG------CAGGCTGAGCCTCCTGAAGAC 801
Hypocrea_lixii                A------AGGAG------CAGGCTGAGCCTCCGGAAGAC 793
Arachnocrea_stipata           A------AGGAG------CAGGCTGAGCCCCCAGAGGAC 780
Sphaerostilbella_aureonitens  A------AAGAT------CAAGCCGAGCCTCCGGAAGAC 801
Sporophagomyces_chrysostomus  A------AAGAA------CAAGCCGAGCCTCCCGAAGAC 781
Metarhizium_anisopliae        A------AAGAA------CAAGCTGAACCGCCTGAAGAC 792
Stachybotrys_echinata         G------AAGAG------CAAGCTGATCCTCCGGAGGAT 710
Peethambara_spirostriata      G------AAGAG------CAAGCAAACCCATTACAGAGC 779
Fusarium_virguliforme         A------AGGAG------CAGGCGGAGCCACCAGAGGAC 570
Penicillium_chrysogenum       G------CCGACAAAGA-CTTGCCAACAGACATGGCACCAGAGGAAC 966
Ophioparma_lapponica          G------ACGATAAGGAGCTTGG-ACCAAATATGGATGCGGAAGAGA 823
Seynesia_erumpens             G------TCGAC------CA--GACC-CTGCCGCCTGGAAGC 758
Daldinia_concentrica          G------CGAG------CGTCGATC-TGGAC-CCGGAGAGC 780
Xylaria_hypoxylon             T------ACGAT------CA---ATCGCTCCCTCCGGGAAGT 780
Botrytis_ricini               AAACCACAGCTTGAC------CTTCGGGTGTCTCTGCCGGGCAGA 826
                              **

Debaryomyces_hansenii         CCAGATAGTT-TGAGT------TATTCTTGGTCATCATTAG------T 992
Candida_tropicalis            GAAGAAGGAGGTGAATCTAGAAAATACACATGGTCTTCATTGG------T 891
Protocrea_pallida             CCTAGCATGA--AGCT------TGGTTGGGAAGGACTCA------T 827
Protocrea_farinosa            CCTAGTATGA--AGCT------TGGTTGGGAAGGACTCA------T 813
Cladobotryum_cubitense        CCGAGCATGA--AGCT------TGGTTGGGAAGGTCTCA------T 813
Hypocrea_voglmayrii           CCAAGTATGA--AGCT------TGGATGGGAGGGGTTAA------T 791
Trichoderma_ovalisporum       CCAAGCCAGA--AGCT------TGGATGGGAAGGGTTGA------T 833
Hypocrea_lixii                CCCAGCATGA--AGAT------TGGATGGGAGGGATTGA------T 825
Arachnocrea_stipata           CCAAGCATGA--AGAT------TGGCTGGGAGGGATTGA------T 812
Sphaerostilbella_aureonitens  CTTAGCATGA--AGAT------TGGCTGGGAAAGCTTGA------T 833
Sporophagomyces_chrysostomus  CCCAGCATGA--AACT------GGGTTGGGAGGGCTTGA------T 813
Metarhizium_anisopliae        CCAAGCGAGA--AAAT------TGGCTGGGAAGGACTGA------T 824
Stachybotrys_echinata         CCAGACCAGA--AGTT------TGGTTGGAAAGGATTGA------T 742
Peethambara_spirostriata      CAAGAGTCGA--AGCT------TGGCTGGCAAGGGTTGA------T 811
Fusarium_virguliforme         CCATCAATGA--AGAT------TGGATGGGAGGGTCTGA------T 602
Penicillium_chrysogenum       GCCGCGAACAGTACTT------CGGATGGGATGGCCTGG------T 1000
Ophioparma_lapponica          GAGAAGCTAAATTCAT------GGGATGGGAAGGCTTGG------T 857
Seynesia_erumpens             GAGG-----ACTACTT------CGGCTGGCAAGGCCTAG------T 787
Daldinia_concentrica          GAGG-----AGTA-TT------TGGTTGGCAAGGCCTAG------T 808
Xylaria_hypoxylon             GAGG-----ATTACTT------CGGATGGCAAGGCCTGG------T 809
Botrytis_ricini               CCAAGCCCCAGTGAGTATT----ATGTAGTTGTCTAGGTTTGGCGATCTT 872
                              * ** * *

Debaryomyces_hansenii         CAATGACGGTATTGTAGAATATGTTGATGCAGAAGAA--GAAGAAACCAT 1040
Candida_tropicalis            GAGTGAAGGTATTGTTGTATATGTTGATGCTGAAGAA--GAAGAAACTAT 939
Protocrea_pallida             TAGGGCAGGAGCTGTGGAATATCTAGATGCTGAGGAA--GAGGAGACGTC 875
Protocrea_farinosa            TAGGGCAGGAGCTGTGGAATATCTAGATGCTGAGGAA--GAGGAGACGTC 861
Cladobotryum_cubitense        TCGGGCGGGTGCTGTCGAGTATCTAGACGCCGAAGAA--GAGGAAACCTC 861
Hypocrea_voglmayrii           TAGGGCTGGTGCGGTGGAATATCTCGATGCAGAGGAA--GAAGAAACGTC 839
Trichoderma_ovalisporum       CAGGGCTGGTGCGGTGGAATATCTCGACGCCGAGGAA--GAAGAAACGTC 881
Hypocrea_lixii                CAGGGCTGGTGCGGTTGAATATCTCGACGCCGAGGAA--GAGGAGACGTC 873
Arachnocrea_stipata           TCCAGCTGGTGCGGTTGAATATCTCGATGCTGAGGAA--GAGGAGACATC 860
Sphaerostilbella_aureonitens  TCGTGCAGGTGCCGTCGAGTATCTCGATGCAGAGGAA--GAGGAGACGTC 881
Sporophagomyces_chrysostomus  TCGTGCAGGTGCCGTTGAGTACCTTGATGCCGAGGAA--GAAGAGACCTC 861
Metarhizium_anisopliae        TCGCGCCGGCGCCGTCGAGTATCTTGATGCTGAAGAA--GAAGAGACAGC 872
Stachybotrys_echinata         CGCGGCGGGTGCGGTCGAATATCTGGACGCCGAAGAA--GAGGAAACTGC 790
Peethambara_spirostriata      CAAGAATGGTGCCGTCGAGTATCTGGATGCTGAAGAG--GAGGAGACATC 859
Fusarium_virguliforme         CCGAGCTGGTACTATCGAGTACCTCGATGCTGAAGAA--GAGGAGTCGGC 650
Penicillium_chrysogenum       GCGTTCAGGAGCAGTTGAGTATGTCGACGCTGAAGAA--GAGGAAACTAT 1048
Ophioparma_lapponica          TAACCAAGGTGTTGTGGAATACGTCGATGCGGAGGAA--GAAGAAACTGT 905
Seynesia_erumpens             AAATGGGGGTATGATTGAGTATCTTGACGCGGAAGAA--GAAGAGACGGC 835
Daldinia_concentrica          TAACGAAGGTGTTATCGAGTACCTCGACGCGGAAGAA--GAGGAGACGGC 856
Xylaria_hypoxylon             TCGTGCTGGTGTGATCGAATATATGGATGCCGAGGAA--GAGGAGACTGC 857
Botrytis_ricini               TCCATCACGTCCAAT-GGGTGTATTGGTTCGGCGCAAATGGGAAAGTGTT 921
                              * * * * * * * * * * *

Debaryomyces_hansenii         AATG-ATTGCTATGACTCCAGAAGACTTGGAAACAAGCAGAAGTACTTTA 1089
Candida_tropicalis            AATG-ATTGCGATGAGTCCTGACGATGTCAAAGCATCCAAGAGTACCATG 988
Protocrea_pallida             TATG-ATTTGCATGACGCCTGAAGATCTGGAG------CTCTA 911
Protocrea_farinosa            TATG-ATTTGCATGACGCCTGAAGATCTGGAG------CTCTA 897
Cladobotryum_cubitense        CATG-ATTTGCATGACCCCAGAAGATCTAGAG------CTGTA 897
Hypocrea_voglmayrii           CATG-ATTTGCATGACACCAGAGGATCTTGAG------CTTTA 875
Trichoderma_ovalisporum       TATG-ATTTGCATGACACCGGAAGATCTTGAG------CTTTA 917
Hypocrea_lixii                CATG-ATCTGCATGACGCCAGAGGATCTCGAG------CTGTA 909
Arachnocrea_stipata           CATG-ATCTGTATGACGCCCGAAGACCTGGAG------CTTT- 895
Sphaerostilbella_aureonitens  GATG-ATTTGCATGACGCCAGAAGATCTGGAA------ATGTA 917
Sporophagomyces_chrysostomus  GATG-ATCTGCATGACGCCTGAGGATCTGGAG------ATGTA 897
Metarhizium_anisopliae        AATG-ATCTGCATGACGCCAGAAGATCTCGAG------CTATA 908
Stachybotrys_echinata         CATG-ATATGCATGACACCGGAGGATCTAGAG------TTGTA 826
Peethambara_spirostriata      TATG-ATTTGTATGACACCGGAAGACTTGGAG------TTGTA 895
Fusarium_virguliforme         CATG-ATTTGCATGACTCCTGAGGATCTCGAC------CTCTA 686
Penicillium_chrysogenum       CATG-ATTGTCATGACCCCTGAGGATCTTGAG------ATCTC 1084
Ophioparma_lapponica          GATG-ATCGTCATGACACCAGAAGATCTTGAG------ATCTC 941
Seynesia_erumpens             CATG-ATATGTATGACGCCCGAAGATCTGGAG------CTGTA 871
Daldinia_concentrica          TATG-ATTTGCATGACACCCGAAGATTTGGAA------ACTTA 892
Xylaria_hypoxylon             AATG-ATTTGCATGACCCCTGAGGACTTGGAA------A-GTT 892
Botrytis_ricini               GATGCAAAGGTATATCTGTTCAACACTTGCGA------TACA 957
                              *** * ** * * *

Debaryomyces_hansenii         TCAGAGACAGAAC----AG------AAGGA----TATGCAACTTG-A-- 1121
Candida_tropicalis            TCAGAAAGTGAAC----AA------CAAGA----ATTACAATTAC-A-- 1020
Protocrea_pallida             TCGTCTTCAGAA-----GG------CTGGT----GTGGAGATGGATG-- 943
Protocrea_farinosa            CCGTCTTCAGAA-----GG------CTGGT----GTGGAGATGGATG-- 929
Cladobotryum_cubitense        CCGACTCCAGAA-----AG------CAGGT----GTCGCTATGGATG-- 929
Hypocrea_voglmayrii           TCGTCTTCAGAA-----AG------CAGGC----ATCGCCACAGACG-- 907
Trichoderma_ovalisporum       TCGTCTTCAGAA-----GG------CCGGC----ATTGCCACGGACG-- 949
Hypocrea_lixii                TCGTCTTCAGAA-----GG------CCGGT----ATTAACACCGAGG-- 941
Arachnocrea_stipata           TCGCCTTCAGAA-----AG------CTGGT----ATC------ 917
Sphaerostilbella_aureonitens  CCGTCTCCAGAA-----GG------CCGGT----GTT------ 939
Sporophagomyces_chrysostomus  TCGTCTGCAGAA-----GG------CAGGT----GTTGCTATGGAAG-- 929
Metarhizium_anisopliae        CCGTCTGCAGAA-----AG------CCGGT----GTTGCTCTTGATG-- 940
Stachybotrys_echinata         TCGCGCACAAAA-----GG------CTGGA----AACGCCGTGATCG-- 858
Peethambara_spirostriata      CCGTGCACAGAA-----GG------CCGGA----AACCCCATCTTTG-- 927
Fusarium_virguliforme         TCGCATGCAAAA-----GG------CTGGT----TACGTGATGGACG-- 718
Penicillium_chrysogenum       TCGACAGCTCCA-----GG------CTGGC----TACGCTCTGCCAG-- 1116
Ophioparma_lapponica          GAGGCAAATACA-----AG------CGGGT----TACGTGATTCCAG-- 973
Seynesia_erumpens             TCGCCTAGCCAA-----GG------CTGGG---ATAGAATCCCAC-C-- 903
Daldinia_concentrica          TCGGATGTCCAA-----A------CTCGG---ATATGACGTATCTC-- 924
Xylaria_hypoxylon             TCCGATGTGTTA-----AA------TTAGGTTTACCGGATCCTTT-C-- 927
Botrytis_ricini               CCAGCGGTAGAACTTGCGGCCTTCTTCTGATCACCCCAGTTACCTGTGGC 1007
                              *

Debaryomyces_hansenii         ---AGAACA-AGAAGTTGACCCT------GCAAAGAGAATCA-AGCCTAC 1160
Candida_tropicalis            ---AGAACA-AGAATTGGATCCA------GCAAAGAGAATTA-AACCAAC 1059
Protocrea_pallida             ---ATGACATGGGCGATGACTTG------AACAAGCGATTGA-AAACAAA 983
Protocrea_farinosa            ---ATGACATGGGCGATGACTTG------AACAAGCGATTGA-AAACAAA 969
Cladobotryum_cubitense        ---ACGATATGGGCGACGATCCA------AACAAGCGTTTGA-AGACAAA 969
Hypocrea_voglmayrii           ---AAGACATGGGAGATGATCCC------AACAAGCGTCTCA-AGACCAA 947
Trichoderma_ovalisporum       ---AAGACATAGGAGATAATCCG------AACCAGCGTCTGA-AGACCAA 989
Hypocrea_lixii                ---AAGACATGGGAGATGACCCG------AACAAGCGACTAA-AGACCAA 981
Arachnocrea_stipata           ------ACATGGATGATGAT------ 931
Sphaerostilbella_aureonitens  ------GTTAATG------ 946
Sporophagomyces_chrysostomus  ---ATGACATGTTGGACGATCCA------AACAAGCGTCTCA-AGACGAA 969
Metarhizium_anisopliae        ---ATGACATTGGAGATGACCTG------AATAAGCGTCTCA-AGACCAA 980
Stachybotrys_echinata         ---AGGATACAAGTGACGATCCA------AACAAGCGACTCA-AGACCAA 898
Peethambara_spirostriata      ---AGGATACGACCGACGATCCA------AACAAGAGGTTGA-AGACGAA 967
Fusarium_virguliforme         ---ACGATAACACGGACGACCCG------AACAGGAGATTGA-AGACCAA 758
Penicillium_chrysogenum       ---ATGACGAAACCAGCGACCCC------AACAAGCGTGTTC-GGTCGAT 1156
Ophioparma_lapponica          ---AGAAC---ACTGATGATCCT------AATAAGCGTGTCA-AGGCACC 1010
Seynesia_erumpens             ---AGGACAACACAGATAACCCC------AACAAGCGTATCA-GAACGAA 943
Daldinia_concentrica          ---AGGATAACGGAGATGAGATT------AACAAGCGTCTCA-AGACTAA 964
Xylaria_hypoxylon             ---A--ACAGCGAGGACACTTTCGCCCCAAACAAGCGGTTGA-AGACGAG 971
Botrytis_ricini               CAAAGAATATTTCAGACCGTTCGTGATT-GTTGTGGATTTCACACCCAAA 1056

Debaryomyces_hansenii         -ACTTGGATTA------CA------TACGTTTA 1180
Candida_tropicalis            CACAAGCAGTAAT------ACTCA------CACATATA 1085
Protocrea_pallida             GACGAACCCAACT------ACTCA------CATGTACA 1009
Protocrea_farinosa            GACAAACCCAACT------ACTCA------CATGTACA 995
Cladobotryum_cubitense        GACAAATCCCACA------ACCCA------CATGTACA 995
Hypocrea_voglmayrii           GACGAATCCAACC------ACCCA------CATGTACA 973
Trichoderma_ovalisporum       GACAAATCCAACA------ACTCA------CATGTATA 1015
Hypocrea_lixii                GACCAACCCGACA------ACTCA------CATGTACA 1007
Arachnocrea_stipata           -----ATC------ 934
Sphaerostilbella_aureonitens  ------
Sporophagomyces_chrysostomus  GACCAACCCTACC------ACCCA------CATGTACA 995
Metarhizium_anisopliae        GACCAACCCCACA------ACTCA------CATGTATA 1006
Stachybotrys_echinata         GACCAATCCGACG------ACCCA------TATGTACA 924
Peethambara_spirostriata      GACCAATCCAACA------ACTCA------CATGTATA 993
Fusarium_virguliforme         AACCAACCCCACA------ACTCA------CATGTACA 784
Penicillium_chrysogenum       TCTCAGCC-AGCGT------GCTCA------CACCTGGA 1182
Ophioparma_lapponica          TATGAACCCAACA------GCTCA------TGCATGGA 1036
Seynesia_erumpens             GATGAACCCAACC------ACGCA------TATGTACA 969
Daldinia_concentrica          GTTGAATCCCACG------ACGCA------CATGTATA 990
Xylaria_hypoxylon             GATAAACCCGACA------ACTCA------CATGTACA 997
Botrytis_ricini               GTCAAATTGAACTCTCGGTTGTTTTCCACGCAACGTTGCAAGTATCTGTA 1106

RPB1000_R (rev. c):             CAT-TGYGA--RATTCACCC
Debaryomyces_hansenii         CCCAT-TGTGA--AATCCATCCTTCAATGATTCTAGGAG--TTGCAG--- 1222
Candida_tropicalis            CCCAT-TGTGA--AATTCATCCTTCTATGATTTTGGGTG--TTGCTG--- 1127
Protocrea_pallida             CGCAT-TGTGA--GATCCACCCTAGCATGATTCTTGGTA--TCTGCG--- 1051
Protocrea_farinosa            CGCAT-TGCGA--GATTCACCCTAGCATGATTCTTGGTA--TCTGCG--- 1037
Cladobotryum_cubitense        CTCAT-TGTGA--AATTCACCCCAGCATGATTCTCGGTA--TTTGCG--- 1037
Hypocrea_voglmayrii           CGCAT-TGCGA--AATTCACCCGAGTATGATCTTAGGTA--TCTGTG--- 1015
Trichoderma_ovalisporum       CTCAT-TGCGA--GATTCACCCGAGTATGATCTTAGGTA--TCTGTG--- 1057
Hypocrea_lixii                CCCAT-TGCGA--GATTCACCCAAGTATGATCTTAGGCA--TCTGTG--- 1049
Arachnocrea_stipata           ------
Sphaerostilbella_aureonitens  ------
Sporophagomyces_chrysostomus  CACAC-TGTGA--GATTCATCCTAGTATGATTCTTGGTA--TTTGCG--- 1037
Metarhizium_anisopliae        CGCAT-TGTGA--AATTCACCCCAGTATGATTCTTGGTA--TTTGCG--- 1048
Stachybotrys_echinata         CCCAC-TGCGA--GATTCACCCCAGTATGATCCTGGGTA--TTTGCG--- 966
Peethambara_spirostriata      CGCAC-TGCGA--AATCCACCCTAGCATGATCCTAGGCA--TCTGTG--- 1035
Fusarium_virguliforme         CTCAT-TGTGA--GATTCACCCCAGCATGATTCTGGGCA--TTTGCG--- 826
Penicillium_chrysogenum       CGCAC-TGCGA--AATCCACCCTAGTATGATCCTGGGTG--TTTGCG--- 1224
Ophioparma_lapponica          CCCAT-TGCGA--GATTCATCCAAGTATGATTCTGGGGA--TTTGTG--- 1078
Seynesia_erumpens             CGCAT-TGCGA--AATTCACCCTAGTATGCTCCTGGGTA--TTTGTG--- 1011
Daldinia_concentrica          CGCAT-TGCGA--GATTCATCCTAGCATGCTCCTGGGTA--TCTGCG--- 1032
Xylaria_hypoxylon             CTCAT-TGTGA--AATTCACCCGAGCATGCTTCTTGGCA--TTTGTG--- 1039
Botrytis_ricini               CACATCTGTTGTCAATCTA-CGGAATAGATTTCTGAACAACTTTGCGAGC 1155

Debaryomyces_hansenii         ------CCTCT------ATTATTCCGTTCCCA---GATCATAACCAAT 1255
Candida_tropicalis            ------CTTCT------ATTATTCCATTCCCA---GATCATAATCAAT 1160
Protocrea_pallida             ------CCAGC------ATCATTCCCTTTCCT---GATCACAACCAGG 1084
Protocrea_farinosa            ------CCAGC------ATCATCCCCTTCCCT---GATCACAACCAGG 1070
Cladobotryum_cubitense        ------CAAGT------ATTATTCCCTTCCCA---GATCACAACCAGG 1070
Hypocrea_voglmayrii           ------CTAGT------ATCATTCCTTTCCCC---GATCACAACCAAG 1048
Trichoderma_ovalisporum       ------CCAGT------A------ 1063
Hypocrea_lixii                ------CTAGT------ATCATTCCTTTCCC------ 1068
Arachnocrea_stipata           ------
Sphaerostilbella_aureonitens  ------
Sporophagomyces_chrysostomus  ------CTAGT------ATCATTCCTTTCCCA---GATCATAACCAGG 1070
Metarhizium_anisopliae        ------CTAGT------ATTATTCCATTCCCC---GATCACAATCAGG 1081
Stachybotrys_echinata         ------CCAGC------ATCA------ 975
Peethambara_spirostriata      ------CTAGC------ATTA------ 1044
Fusarium_virguliforme         ------CCAGT------ATCATTCCTTTCCCC---GATCACAACCAGG 859
Penicillium_chrysogenum       ------CTAGT------ATTATTCCGTTCCCG---GATCATAACCAG- 1256
Ophioparma_lapponica          ------CAAGC------ATCATCCCCTTCCCA---GACCACACCCAGG 1111
Seynesia_erumpens             ------CGAGC------ATTATCCCTTTCCCA---GATCATAACCAGG 1044
Daldinia_concentrica          ------CCAGC------ATCATTCCGTTCCCG---GATCACAATCAG- 1064
Xylaria_hypoxylon             ------CCAGT------ATCATCCCTTTCCCC---GATCATAATCAA- 1071
Botrytis_ricini               AAAGGTCCAGCCAAATCCAAACGCTTCTTTCCGAAATGATCACGATCATC 1205

Debaryomyces_hansenii CG------CCTCGTAATACCTATCAATCTGCT 1281
Candida_tropicalis CA------CCACGTAATACTTATCAATCTGCT 1186
Protocrea_pallida TATGTTTGAAATCTGATAACTACGTTATATCTGTTTACTA------ 1124
Protocrea_farinosa TATGTT------ 1076
Cladobotryum_cubitense TATGTC------ 1076
Hypocrea_voglmayrii TATGTCGGCTTGACAATTAATCTCCTTTTCTGCGCCC------ 1085
Trichoderma_ovalisporum ------
Hypocrea_lixii ------
Arachnocrea_stipata ------
Sphaerostilbella_aureonitens ------
Sporophagomyces_chrysostomus TACGTG------ 1076
Metarhizium_anisopliae TAAGC------ 1086
Stachybotrys_echinata ------
Peethambara_spirostriata ------
Fusarium_virguliforme TATG------ 863
Penicillium_chrysogenum ------TCGCCG------CGTAACACC---TATCAGTCGGCC 1283
Ophioparma_lapponica TATGTTCTATCGATCGCTGTACCAAACATAGCACCAACTACAAATTAAGT 1161
Seynesia_erumpens TGTGCT------CCCCCTTCTATGTATTTGCCGAAT 1074
Daldinia_concentrica ------
Xylaria_hypoxylon ------
Botrytis_ricini ------

RPB7CR (rev. c): 5’-ATGGGTAA--GCAAGCTATGGG-3’
Debaryomyces_hansenii ATGGGTAA--GCAAGCTATGGGTGTCTTCTTAACTAACTATGCAGTGAGA 1329
Candida_tropicalis ATGGGTAA--GCAAGCTATGGGTGTTTTCTTGACCAATTATTCTGTGAGA 1234
Protocrea_pallida ------
Protocrea_farinosa ------
Cladobotryum_cubitense ------
Hypocrea_voglmayrii ------
Trichoderma_ovalisporum ------
Hypocrea_lixii ------
Arachnocrea_stipata ------
Sphaerostilbella_aureonitens ------
Sporophagomyces_chrysostomus ------
Metarhizium_anisopliae ------
Stachybotrys_echinata ------
Peethambara_spirostriata ------
Fusarium_virguliforme ------
Penicillium_chrysogenum ATGGGCAA--ACAAGCCATGGGTGTGTTCTTGACGAACTTTTCACAGCG- 1330
Ophioparma_lapponica CTCCTCGA--AACACCTACCAGTCCGCTATGGGCAAACAAGCCATGGG-- 1207
Seynesia_erumpens GTTATTTATTATATGCTAACCTCGTACTCTAGTCACCC------ 1112
Daldinia_concentrica ------
Xylaria_hypoxylon ------
Botrytis_ricini ------
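
The two reverse primers in this alignment are labelled "rev. c", which is taken here to mean that the reverse complement of the primer is shown so that it reads in the same direction as the aligned sequences. The following minimal sketch recovers a primer's 5'-3' sequence from the form shown. The function names `degap` and `reverse_complement` are illustrative and are not part of the original analysis. The complement table follows the standard IUPAC ambiguity codes (for example, Y pairs with R).

```python
# Standard IUPAC nucleotide codes mapped to their complements.
# The alignment gap character "-" is mapped to itself.
_COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN-", "TGCAYRSWMKVHDBN-")


def degap(aligned: str) -> str:
    """Remove alignment gap characters from a sequence."""
    return aligned.replace("-", "")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a (possibly degenerate) DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Primer strings exactly as displayed above the alignment (gaps included).
rpb7cr_shown = "ATGGGTAA--GCAAGCTATGGG"
rpb1000_r_shown = "CAT-TGYGA--RATTCACCC"

print(reverse_complement(degap(rpb7cr_shown)))     # CCCATAGCTTGCTTACCCAT
print(reverse_complement(degap(rpb1000_r_shown)))  # GGGTGAATYTCRCAATG
```

The gaps are removed before reversing because they belong to the alignment, not to the oligonucleotide itself.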


Appendix 2.2 Alignment of β-tubulin sequences from Sordariomycete species. Priming sites are indicated by shading.

Entoleuca_mammta -AACATGCGTGAGATTGTAAGT---GTTTG----TTTTTGTTTT-----T 37
Rosellinia_merrillii -AACATGCGTGAGATTGTAAGT---GTTTACCTTTTTTTATTTTCTCTAT 46
Nemania_macrocarpa -AACATGCGTGAGATTGTAAGT---ATT------ 24
Nemania_illita -AACATGCGTGAGATTGTGAGT---G------ 22
Amphirosellinia_fushanensis -AACATGCGTGAGATTGTAAGT---GTT------ 24
Astrocystis_bambusae -AACATGCGTGAGATTGTAAGT---ACT------ 24
Kretzschmaria_guyanensis -AACATGCGTGAGATTGTAAGT---GCTCT------ 26
Kretzschmaria_lucidula -AACATGCGTGAGATTGTAAGT---GCCCT------ 26
Penzigia_cantareirensis -AACATGCGTGAGATTGTAAGT---GCTCT------ 26
Stilbohypoxylon_elaeicola -AACATGCGTGAGATTGTAAGT---GTTCC------ 26
Discoxylaria_myrmecophila -AACATGCGTGAGATTGTAAGT---GCAGC------ 26
Podosordaria_mexicana -AACATGCGTGAGATTGTAAGTCCGGCTTGCG------GT 33
Whalleya_microplaca -AACATGCGTGAGATTGTAAGT------ 21
Creosphaeria_sassafras --ACCCGTGCGTTGCTTCTATC------ 20
Annulohypoxylon_cohaerens ------CCATC------ 5
Daldinia_concentrica TAAGTTATATCAAATTCTTGTT------ 22
Hypoxylon_investiens TAACATGCGTGAGATTGTAAGT------ 22
Theissenia_cinerea -AACATGCGTGAGATTGTAAGT------ 21
Mn_majus027 --GAATAAACTCGGCGGCAAAC------ 20
Mn_majus049 ------
Xylaria_escharoidea -AACATGCGTGAGATTGTAAGT------ 21

Entoleuca_mammta GTTTCCTGCCTCG------CTGTTCGCCCGCAG----C 65
Rosellinia_merrillii TTTTCGAGCCTCG------CTGTTCGCCCGCAG----C 74
Nemania_macrocarpa -----GTGCCCC------CGCTCGGCCGTGGT-GTC 48
Nemania_illita -----GCGCTCC------TCCTCTGTCGT----CTC 43
Amphirosellinia_fushanensis ----CTTCCCCT------CCCCTGGCCCGCAC---GG 48
Astrocystis_bambusae ----CCTATGCTGA------TCCCTGGCCTGCAT---AC 50
Kretzschmaria_guyanensis ----CCCGTCTTT------CAGTCGTCTTC------ 46
Kretzschmaria_lucidula ----GCCGACCC------CAGCGGTCTTC------ 45
Penzigia_cantareirensis ----TCTATCTTC------GAGTGGTCTTCAA----- 48
Stilbohypoxylon_elaeicola ----TCCACCTCCACGGCCTGACCTGCACGGGCATCAATCATCAGCTCGC 72
Discoxylaria_myrmecophila TATCCCCGACCCTG------CTTCTGGCCCCATCATCC 58
Podosordaria_mexicana CTTCCCCAATCCCA------CTGCCATCTGCAA----- 60
Whalleya_microplaca ----CTTACCTC------CTACCTGTA------ 38
Creosphaeria_sassafras ----CACACAGCG------CCACTGAAA------ 38
Annulohypoxylon_cohaerens ----CCCA------ 9
Daldinia_concentrica ----TACCCCTA------ACCGAAG------ 37
Hypoxylon_investiens ----ATCATCTCG------TCTTATGTT------ 40
Theissenia_cinerea ------CAT------GACAGAAGA---- 33
Mn_majus027 ------
Mn_majus049 ------
Xylaria_escharoidea ----CTTG------GACTACG------ 32

Entoleuca_mammta CCACCCCCTCTGTTTACTT---TTGCAACCCAAT-CA-AGCTTCGTGA-- 108
Rosellinia_merrillii CCACCCCCTCTGTTTACTT---TTGCAACCCAAT-CA-AGCTTCGTGA-- 117
Nemania_macrocarpa CCTTCCCCTCTGTTTACTTTTTTTGCAACCCAAT-CA-ATATCGGGCA-- 94
Nemania_illita CCTTCCCCTCTGTTTACTT---TTGCAACCTAAT-CA-AGCTCCGTCA-- 86
Amphirosellinia_fushanensis GCTTCCCCTCTGTTTACTT---TTGCAACCCAATCAA-AGCTATGGCA-- 92
Astrocystis_bambusae CCTTCCCCTCTGTTTACTT---TTGCAACCCGATGCG-AGCTCCATCA-- 94
Kretzschmaria_guyanensis --CCCTCCTCTGTTTACTTT--T--CAACCCAAT-CA--GCTGTGTTG-- 85
Kretzschmaria_lucidula --CCC---TCTGTTTACTTT--TT-CAACCCAAT-CA-AGATCTGTCG-- 83
Penzigia_cantareirensis --CTCCCCTCTGTTTACATT--T--CAACCCAAA-CA-AGCTCCGTCGTC 90
Stilbohypoxylon_elaeicola CTCTCCCCTCTGTTTACTTT--TTGCAACCCAAT-CACAGCTCCGCCA-- 117
Discoxylaria_myrmecophila CCCTCTCTGCTGTTTACTT---TGGCAGCCCAAT-CATGGCCAGCTCA-- 102
Podosordaria_mexicana --TTCCCCTCTGTTTACTT---TTGCAACCCAACGAA-TACTCCACCA-- 102
Whalleya_microplaca ------C-CTGTTTACTT-----TCAGCCA------CCCACCGG-- 64
Creosphaeria_sassafras ------CGCTGCAGCATT-----GCAGCGG------CA----AG-- 61
Annulohypoxylon_cohaerens ------TGC------ 12
Daldinia_concentrica ------CCCAGTATCACT-----AGAGAAA------C------ 57
Hypoxylon_investiens ------TACTGCTGACCC-----GAACCCTGTC------CCCAATTG-- 70
Theissenia_cinerea ------GCCCCTTTTGCCTC----TCGACTC------CCTCCTCG-- 62
Mn_majus027 ------CCGTATCG-- 28
Mn_majus049 ------TCG-- 3
Xylaria_escharoidea ---TCCCCCCTGCTTGGATGGATATCGACGGGAC-CAAAACTTCTTAC-- 76

Entoleuca_mammta -ATTTTTTTTTGGGCTACCCTATCC---C-TGAAAACG-CGTCC------ 146
Rosellinia_merrillii -ATTTTTTTT-GGGCTACCCTATCC---C-CGAAAACG-CGTCT------ 154
Nemania_macrocarpa -ACTTTCT-----GCTACCCTATCA---C-TGGA--CG-CGTTC------ 125
Nemania_illita -ATTTTTTT----GCTACCCTATCC---C-TGAA--CG-CGTCC------ 118
Amphirosellinia_fushanensis -ATTTTTTTT---GCTACCCTATCT---C-TGAA--CA-CGTCCATCC-- 129
Astrocystis_bambusae -ATTCT---T---GCTACCTGATCC---C-TGAA--CG-CGTTTACTG-- 128
Kretzschmaria_guyanensis -GTATTCT-----GCAACCTTGGGC---C-AGGA--CG-CGTCC------ 116
Kretzschmaria_lucidula -ATATTCT-----GCTGCCCCAACC---C-AGGA--CG-CGTCC------ 114
Penzigia_cantareirensis GGTTTTCT-----GCTGCCATAACC---CCAGGA--CG-CGTCC------ 123
Stilbohypoxylon_elaeicola --TTTTTT-----GCTGCCCTATCC---C-TGAA--CG-CGTCC------ 147
Discoxylaria_myrmecophila AGTTTTTTT-----CTGCCTTATCC---C-TGAA--CG-CGCCC------ 134
Podosordaria_mexicana AATTTT------GCTGCCTTACATGAACTTGAACGCGTCGCCCCAGT-- 143
Whalleya_microplaca ------TTCCT------GAA--CG-CGTCCCAATCT 85
Creosphaeria_sassafras ------ACCCT------GAAT-CG-CGTCTCGGCTC 83
Annulohypoxylon_cohaerens ------CCCT------GAA--CG-CGTCCTAATCC 32
Daldinia_concentrica ------CCCT------GAA--CG-CGTCCGAA--- 74
Hypoxylon_investiens ------CCCCT------GAA--CG-CGTCCAAA--- 88
Theissenia_cinerea ------CCCTGCCC------GA--CACCCCCT------ 80
Mn_majus027 ------CC------GAAA-----ATCT------ 38
Mn_majus049 ------CC------GAAA-----ATCT------ 13
Xylaria_escharoidea ---TGTCTAACGGCCCGCCCCTTTC---TTCGAAA-CG-CCCCCATCGG- 117
* *

Entoleuca_mammta ---TAGACGCC--CCCGGCTTCGA---TCGTGGT--CTGTCCACA----- 181
Rosellinia_merrillii ---TGGACGCC--CCCGGCTACGA---TCGTGGTG-CCGTCCGCAA---- 191
Nemania_macrocarpa ---TGTACG-----CC--TCTTGA---CCTTGGC--CTCTCCACA----- 155
Nemania_illita ---CACCCGAT--ACC--CCTTGC---TATTAGT--CTGTCCACA----- 151
Amphirosellinia_fushanensis ---CCGATGTTGGGTCAATGCTGG---TCTCTGCACATCTCTACAACATT 173
Astrocystis_bambusae ---CAAATATTGAA-CGATGCCGG---CATCTCC--CTCTCCACGACA-- 167
Kretzschmaria_guyanensis ---CAGCCCTACAATCCTCCTTGA---CCTTAGC--ATCCCCATA----- 153
Kretzschmaria_lucidula ---CAACCCAGCACCGCTTCT-GA---CCTTCGC--CTCTCGGCG----- 150
Penzigia_cantareirensis ---CAAACCT-CTATGTCCCTTGA---TCTTAGC--CTCTCCACA----- 159
Stilbohypoxylon_elaeicola ---AACTTCCATTATGCTCTTGGA---TCCTAGC--CTCTCCACA----- 184
Discoxylaria_myrmecophila -----GCCA----ATGGTCCATGA---CATCAACA-CCCTTCCC------ 165
Podosordaria_mexicana ---AGCAAATCCAATAGGCCTCAC---TATCAGGG-CTCTCCAC------ 180
Whalleya_microplaca --CCAATTCCCC------CCTGA---T-TTATCTGCCCCTCCAC----- 117
Creosphaeria_sassafras AATCAATGCCCCATGA---CCTGA---A-TCCGCTTCTCCTCCCT----- 121
Annulohypoxylon_cohaerens TCCAAAAGCCCC------CTTGA---T-TT--CTGCCCCTCACG----- 64
Daldinia_concentrica ---AAAAACTCCAAAACATCCTTA---G-TT--CTACCCCTCATA----- 110
Hypoxylon_investiens ----AAATCTCCAAACC--CCTGA---T-TT-TCTGCCCCTCATG----- 122
Theissenia_cinerea ---CATCCCCCAGCAAGTACCCAG---ACGCGACGCCCCTCCACA----- 119
Mn_majus027 ------TCGG---TCGCAGAGGAATCTGGCA----- 60
Mn_majus049 ------TCGG---TCGCAGAGGAATCTGGCA----- 35
Xylaria_escharoidea ---CGCGCTTAATGACATGCATAAGCATCGTGGCGAATGTACCTA----- 159

Entoleuca_mammta -TCGATAC----GCAAG-CATCATG-----CGC--GATCACTGAT----- 213
Rosellinia_merrillii -TCGAGAC----GCAAGGCGTCATG-----CGC--GAGCACTGATT---- 225
Nemania_macrocarpa -TATACCC----GTCAG-CATCGTG-----AGA--AGTCAACACT----- 187
Nemania_illita -CCTATAC----ACGAG-CATCATC-----TGT--AAACAA------ 179
Amphirosellinia_fushanensis ATCATTTGCA--GTGGGCCATGCCG-----CAT--AGCTGCTA------ 207
Astrocystis_bambusae ATCAATTT-----TGAGTTTTGGCA-----CGC--AGTCGCCAGA----- 200
Kretzschmaria_guyanensis -TTTGCGTGAGCATCATGTGCGGTG---GAGG---ATAGGGCA------ 189
Kretzschmaria_lucidula -TCCGCACAACCATCGTCTGCAGTG---GACG---ATGTGGTG------ 186
Penzigia_cantareirensis -TCT------ATCTCCTCTT---GACG---AACTGGTG------ 184
Stilbohypoxylon_elaeicola -TCTATATGCAAGCAAGCTTCGATGTCAGACGTCAATGCTGCGCTATAGA 233
Discoxylaria_myrmecophila -TTCGCTGGTGCACCAG-CATCATC------AGCAGAA------ 195
Podosordaria_mexicana ------TCGTG---AACG---AAGGAGTG------ 197
Whalleya_microplaca ------ATCC------AACACGGCACTGCTG------ 136
Creosphaeria_sassafras ------CTCC------A-CACATATTCACTG------ 139
Annulohypoxylon_cohaerens ------CACAGA----AACAAAAAAACACCA------ 85
Daldinia_concentrica ------CAC------AAAACGCTACCATAT------ 128
Hypoxylon_investiens ------CACACAAAACAGCACAGCTTC-CTG------ 146
Theissenia_cinerea ------CACGCTCTGTC----ACGCACACATGACA------ 144
Mn_majus027 ------AAAGGGTG------ 68
Mn_majus049 ------AAAGGGAG------ 43
Xylaria_escharoidea ------TCCGTCAATGACGGGATTCCCCTTGA----- 185

Entoleuca_mammta -GCTATGATCGT-CGATGATG-CTACTTCGTTGGGTCCCC------ 250
Rosellinia_merrillii -GTCACGATCGT-CGATGATGGCCATTTCGTTGGGTCCTG------ 263
Nemania_macrocarpa -CTGGCGATGGTACAACGAGCACAGATTTACACAAACCAC------ 226
Nemania_illita ------ATGCGAATAGCACGGCTTCAGTCATTCCCC------ 209
Amphirosellinia_fushanensis ----CCAATCGCAC---AATTTCCTC-CTG-CCTATAGGG------ 238
Astrocystis_bambusae --TGCCGATTGCATCCAAACCTCCTCGTTGTCCCACATAA------ 238
Kretzschmaria_guyanensis GCCACCGATAGCACATGCACCTTGG--ATT--CAGCCTGC------ 225
Kretzschmaria_lucidula GCGACCAAGAGCTTACGG--CTTG---ATT--CCGCCTGC------ 219
Penzigia_cantareirensis GCTACCAATAGCATGCAGCTCT-----ATT--CGATCTGT------ 217
Stilbohypoxylon_elaeicola GCTGCCAATTACACG-GCTTCATGCAAATTACCAGGTTGCAGGGTTG--- 279
Discoxylaria_myrmecophila ------GGCCGCATAGCAGTTGCTA--GCTACCTTCTCTC------ 227
Podosordaria_mexicana -----CGAGTAACTGAGAAGCAACACCTCTTTCTGGCACT------ 232
Whalleya_microplaca -----CAGCTCTCTCTACCTGC--ACCTCGG-CATCGCATCAGTCAATTC 178
Creosphaeria_sassafras -----C--CCTTCGAGACTCG---ACCTCGA-CGTT------ 164
Annulohypoxylon_cohaerens -----CCATATTCCATCGTTGTTTACTTT-A-CGCCA------ 115
Daldinia_concentrica -----CTACATTTTATA-TTGC--AACT--A-CG------ 151
Hypoxylon_investiens -----CAGCTCTTT-TCGCTAC--GATTCAATCGATTC------ 176
Theissenia_cinerea --CGAGCACCACCCCATCCCGTGGCCCTCGACCGCGTCGG------ 182
Mn_majus027 ------
Mn_majus049 ------
Xylaria_escharoidea ---AGAAACCGAGCTTGGCTTCATGATTCACGCAGTCACC------ 222

Entoleuca_mammta ----TGGGACGTTAT--CGAA-----ACCGACT------GTTGCTTTA 281
Rosellinia_merrillii ----TCGGATGCTTCA-CGAA-----ACGGAAG------ATTGCTTTA 295
Nemania_macrocarpa ----TAGATTAGCTGC-GGAA-----ATTA------TTTATGA 253
Nemania_illita ----CGTATCATCCCG-AGAA-----ATCA------TGCATGA 236
Amphirosellinia_fushanensis ----TTTATTATCTA--CATT-----ATCTACAGAAGCTTGATAACATGA 277
Astrocystis_bambusae ----CA-ATTTCCTGAGCCTA-----ATCAACA------ACAAAGAGA 270
Kretzschmaria_guyanensis ----CCAGTTGCCTAGCAAAA----TAT------TTCACGA 252
Kretzschmaria_lucidula ----CAAGTTGTCTAGCACAA----TAT------TTGATGA 246
Penzigia_cantareirensis ----CAAGTTGCCTAGAAAAA----AATAG------TACAGGA 246
Stilbohypoxylon_elaeicola ---TCGGGCTGCCCAGAGACGCGTGAATCG------TGAATGA 313
Discoxylaria_myrmecophila ----CAGATACCCTTATGCAA----AATTT------TGAACGG 256
Podosordaria_mexicana ------TTGACCGTCGGGA------ACACAG 251
Whalleya_microplaca AATTCAAGCTTCTCGAGAATATGG------AT-GTGA 208
Creosphaeria_sassafras --TTCGAGTATCATATGAAAGCGA------ACCAAGT 193
Annulohypoxylon_cohaerens GCTTCAAAT-TTACACAATCGTG------ACATTCT 144
Daldinia_concentrica ---TCAAATTGTATACCAACGTG------TCGGGG 177
Hypoxylon_investiens -AATCAAACTGGACAACGATGAG------ATGA 202
Theissenia_cinerea GCTGTCAACAACCCGACCCGAC------AGGATGG 211
Mn_majus027 -----GAAATAAACAAGCAAG------ 84
Mn_majus049 -----GAAATAAACAAGCAAG------ 59
Xylaria_escharoidea ----CGGACGACGAGATGTTA------TTTTGT 245

Entoleuca_mammta GAGTTAA------GCTAACGA-TACCTTTCCCC----GTGTCT 313
Rosellinia_merrillii GAGTTAG------GCTAACGA-TGGGTTTCCCC---CATGTCT 328
Nemania_macrocarpa GAGCGAT------GCTAACTG-TGGTCTGTTC-----GTCGAT 284
Nemania_illita GAATCAG------ACTAACCA-TTGTTTACCT-----GTTTAC 267
Amphirosellinia_fushanensis GAATCGC------GCTAACCG-TGCTCCCCCC-----GTCTCT 308
Astrocystis_bambusae AAA-CGC------GCTAACCA-TGGTTTCCCC-----CTC--- 297
Kretzschmaria_guyanensis AAACTAG------ACTGACCA-TGCTCTTCCG-----ACCTGC 283
Kretzschmaria_lucidula AAACCAG------GCTAACCG-CGCTCTTCCC-----CTCTAC 277
Penzigia_cantareirensis TAATCAA------GCTAACAG-TGCTTCTCCA-----CTCTAC 277
Stilbohypoxylon_elaeicola GCATTAGAAATAGGAAGCTTGCTAACCG-GGCGTCTCC------CTCAAC 356
Discoxylaria_myrmecophila G------GCTAACCA-TACGCGACCC-----GACTAT 281
Podosordaria_mexicana ATTTTTG------GCTAACCA-AACTTTTCTC------TT 278
Whalleya_microplaca GAACCAC------AGCTAACCG-TGTCTTTTTTTTCGTCTGAAT 245
Creosphaeria_sassafras GAATCAC------AACTAACCG-C-TTTTTTCTCTCTTATCTAT 229
Annulohypoxylon_cohaerens CGATCAATA------AGCTAACCA-TATCTTTTTT--CATCCCAAT 181
Daldinia_concentrica AAATCAA------AGCTAACCG-CGTTTCTTC------AAT 205
Hypoxylon_investiens AAATCAC------AGCTAACAAACATTTGTCTC------T 230
Theissenia_cinerea AAAATGCAA------AGCTAACCCGTGTCTCGCTCT----CTCGAT 247
Mn_majus027 ------GCTAAC-ACTCTCTTCCCCG------ACAC 107
Mn_majus049 ------GCTAAC-ACTCTCTTCCCCG------ACAC 82
Xylaria_escharoidea ATACAAA------AGCTGACTTCTGTTTTTTTTCGCGATCGTGT 283
** **

Entoleuca_mammta AGGTTCACCTCCAAACCGGC--CAATGCGTAAG---CTTC--CTTCGATG 356
Rosellinia_merrillii AGGTTCACCTCCAAACCGGC--CAATGCGTAAG---CTGC--CTTCGACG 371
Nemania_macrocarpa AGGTCCACCTCCAAACCGGC--CAATGCGTAAG---TCGC-TCTTCGATC 328
Nemania_illita AGGTCCACCTCCAAACCGGC--CAATGCGTAGG---TCGC-CCTTCGATC 311
Amphirosellinia_fushanensis AGGTTCACCTTCAGACCGGC--CAATGCGTAGG---TCAT-CCTTCAGTC 352
Astrocystis_bambusae AGGTTCACCTCCAGACCGGC--CAATGCGTAGG---TCAC-TGCTCGGAT 341
Kretzschmaria_guyanensis AGGTTCACCTCCAAACCGGT--CAATGCGTAAG---TCGC-TCTGCGATC 327
Kretzschmaria_lucidula AGGTTCACCTCCAAACCGGC--CAATGCGTAAG---TCGC-TCTGCGGCC 321
Penzigia_cantareirensis AGGTTCACCTCCAGACCGGC--CAATGCGTAAG---CTGC-CTTACGGTT 321
Stilbohypoxylon_elaeicola AGGTTCACCTCCAAACCGGC--CAATGCGTAGG---TCGC-CATCCGACC 400
Discoxylaria_myrmecophila AGGTTCACCTTCAAACCGGC--CAATGCGTAGG---TCGC-CCTTTGATG 325
Podosordaria_mexicana AGGTCCACCTTCAAACCGGC--CAATGCGTGAG---TGAT-AGCTCGCTT 322
Whalleya_microplaca AGGTTCATCTTCAGACCGGC--CAATGCGTAAGTAGTAAC-CTTCCGACC 292
Creosphaeria_sassafras AGGTTCATCTTCAGACCGGC--CAATGCGTAAG---TAAC-ATCCCGACA 273
Annulohypoxylon_cohaerens AGGTTCACCTCCAGACCGGC--CAATGCGTAAGTTTTATT-ACTTGGACT 228
Daldinia_concentrica AGGTTCATCTTCAGACTGGC--CAATGTGTAAG---TAAC-AG--CGATC 247
Hypoxylon_investiens AGGTTCACCTCCAGACCGGC--CAATGCGTAAGTA-CAACGATCACAACC 277
Theissenia_cinerea AGGTTCACCTCCAGACCGGC--CAATGCGTAAG----TGC-CTCCCAACC 290
Mn_majus027 AGG------CAAACCATCTCCAGTGAGCACG-----GT---CTCGACA 141
Mn_majus049 AGG------CAAACCATCTCCAGTGAGCACG-----GT---CTCGACA 116
Xylaria_escharoidea AGGTTCACCTTCAGACCGGC--CAATGCGTAAG---TCCTACTCGAGAAT 328
*** ** ** ** ** * *

Entoleuca_mammta C----TT------CGCG-ATGACGGCGAG---CA-TC------ 378
Rosellinia_merrillii C----CT------CGCG-ATGACGGCGAG---CA-TC------ 393
Nemania_macrocarpa CGCCGTC------CTCGCACGACGTCAAG---AG-TT------ 355
Nemania_illita C----TC------CACG-ACGATGTCGAG---AA-TT------ 333
Amphirosellinia_fushanensis C-----T------CATA-ACGATGTCGAG---AA-GC------ 373
Astrocystis_bambusae C-----C------CA----CGATGTCGAA---GA-CT------ 359
Kretzschmaria_guyanensis A----TT------GATG-CCGACGTCTAG---CA-TC------ 349
Kretzschmaria_lucidula C----CC------GATG-ACGACGTCTAG---AC-TC------ 343
Penzigia_cantareirensis C----TC------GATG-ACGATATTTAG---AA-TC------ 343
Stilbohypoxylon_elaeicola C----CC------CGTA-ACGACTTCGAG---AACTC------ 423
Discoxylaria_myrmecophila G----CC------GATTACCGGCGTCGAC---AG------ 346
Podosordaria_mexicana C------GTGA--TCGTG---GCATA------ 337
Whalleya_microplaca ----ACT------GA--CAATTGCGCTGGA---GG-AT------ 314
Creosphaeria_sassafras ----AGC------GAACCCTTTGCGCTGTG---AA-AC------ 297
Annulohypoxylon_cohaerens ----AACTATGGACGAG-ATATCGCGCTGGG---AT-ATTGTGATATAGG 269
Daldinia_concentrica ----ATC----GAAGAA-CCATGGATATATA---AG-AC------ 273
Hypoxylon_investiens G-CGACC------GAG-GCAACGCGCTAGA---AATAT------ 304
Theissenia_cinerea G----GC------AGAACACCCCCCATCGCGCTAGACTC------ 319
Mn_majus027 G------CAATGGCGTG------ 152
Mn_majus049 G------CAATGGCGTG------ 127
Xylaria_escharoidea C------ATGTCTCTATG---ATCGTT------ 346

Bt2a: GGTAACCAAATCGGTGC
Entoleuca_mammta ----TCGTGG-ACTCACA-ATGAAATCAT--AGGGTAACCAAATCGGTGC 420
Rosellinia_merrillii ----TCGTGA-GCTCACATATGGAATCAT--AGGGTAACCAAATCGGTGC 436
Nemania_macrocarpa ----GCCCGAAGCTCACA--CAACACGAT--AGGGTAACCAAATTGGTGC 397
Nemania_illita ----CCCCGAGGCTCACA--CTTTCTAAT--AGGGTAACCAAATTGGTGC 375
Amphirosellinia_fushanensis ----TCCAGCAGCTCACA---TATATGAT--AGGGTAACCAAATTGGTGC 414
Astrocystis_bambusae ----TCCGACGACTCACA---TAACTTACC-AGGGTAACCAAATCGGTGC 401
Kretzschmaria_guyanensis ----TCCCAAGGCTCACA--TAGCATGATC-AGGGTAACCAAATCGGTGC 392
Kretzschmaria_lucidula ----TCCCGAGGCTCACA--TTACATAACC-AGGGTAACCAAGTCGGTGC 386
Penzigia_cantareirensis ----CTCCGACGCTCACA--TATAATCATC-AGGGTAACCAAATTGGTGC 386
Stilbohypoxylon_elaeicola ----CCCCGTGGCTCACA--TCGTATGAT--AGGGTAACCAAATTGGTGC 465
Discoxylaria_myrmecophila ----CCTTGGAGCTCACA--TGTTTTGAT--AGGGCAACCAAGTTGGTGC 388
Podosordaria_mexicana ----GTCGTAGACTAACA--CTAAATAAT--AGGGTAACCAAGTTGGTGC 379
Whalleya_microplaca -----AGCGGGGCTCACA-TAAAATT-AC--AGGGTAACCAAATCGGTGC 355
Creosphaeria_sassafras -----CGTGGGGCTCACA-CGATATTGAC--AGGGTAACCAAATTGGTGC 339
Annulohypoxylon_cohaerens AATATAGCGGGGCTAATA-TGAAGACGGT--AGGGTAACCAAATTGGTGC 316
Daldinia_concentrica ---ACAGCGGGGCTCACA-TGAAGATGAT--AGGGTAACCAAATCGGTGC 317
Hypoxylon_investiens -----AGAGGGGCTCACA-C-AAACTAAC--AGGGTAACCAAATCGGTGC 345
Theissenia_cinerea ----TGGCGTCGCTCACA-CGAGATTCGC--AGGGCAACCAAATTGGTGC 362
Mn_majus027 ------TAAGTTCA------ATAACCGACTCG---- 172
Mn_majus049 ------TAAGTTCA------ATAACCGACTCG---- 147
Xylaria_escharoidea ----TCCTGGGACTCATA-----TGTTATACAGGGCAACCAAATTGGTGC 387
* * **** * * *

Bt2a (continued): TGCTTTC
Entoleuca_mammta TGCTTTCTGG------ 430
Rosellinia_merrillii TGCTTTCTGG------ 446
Nemania_macrocarpa TGCTTTCTGG------ 407
Nemania_illita TGCTTTCTGG------ 385
Amphirosellinia_fushanensis TGCTTTCTGG------ 424
Astrocystis_bambusae TGCTTTCTGG------ 411
Kretzschmaria_guyanensis CGCTTTCTGG------ 402
Kretzschmaria_lucidula TGCTTTCTGG------ 396
Penzigia_cantareirensis TGCTTTCTGG------ 396
Stilbohypoxylon_elaeicola TGCTTTCTGG------ 475
Discoxylaria_myrmecophila TGCTTTCTGG------ 398
Podosordaria_mexicana TGCGTTCTGG------ 389
Whalleya_microplaca TGCTTTCTGGTGCGTAAGCCTACC----TGCGAGCCAATCTGCGC---TG 398
Creosphaeria_sassafras TGCTTTCTGGTGCGTA--CCTACGAATATACGA-CTATTTCGCGTCCATG 386
Annulohypoxylon_cohaerens TGCTTTCTGG------ 326
Daldinia_concentrica CGCTTTCTGG------ 327
Hypoxylon_investiens TGCTTTCTGG------ 355
Theissenia_cinerea TGCTTTCTGG------ 372
Mn_majus027 CACTTCTTG------ 181
Mn_majus049 CACTTCTTG------ 156
Xylaria_escharoidea TGCTTTCTGG------ 397
* * **

Entoleuca_mammta ------
Rosellinia_merrillii ------
Nemania_macrocarpa ------
Nemania_illita ------
Amphirosellinia_fushanensis ------
Astrocystis_bambusae ------
Kretzschmaria_guyanensis ------
Kretzschmaria_lucidula ------
Penzigia_cantareirensis ------
Stilbohypoxylon_elaeicola ------
Discoxylaria_myrmecophila ------
Podosordaria_mexicana ------
Whalleya_microplaca GCAC---TGAAATAGCAGTCAAT-CATTGGTGCTGGGAATACTGATTTCA 444
Creosphaeria_sassafras ACGCGAATGAGAC-GCGATCGAGACAACGACGAGCAGGACATTGACGTC- 434
Annulohypoxylon_cohaerens ------
Daldinia_concentrica ------
Hypoxylon_investiens ------
Theissenia_cinerea ------
Mn_majus027 ------
Mn_majus049 ------
Xylaria_escharoidea ------

Entoleuca_mammta ------CAACAAATTTCGGGCGAGCACGGCCTTGACGGCAATGGCG 470
Rosellinia_merrillii ------CAACAAATTTCCGGCGAGCACGGCCTCGACGGCAATGGAG 486
Nemania_macrocarpa ------CAACAAATTTCAGGCGAGCACGGTCTCGACGGCAGTGGCG 447
Nemania_illita ------CAACAAATTTCCGGCGAACACGGTCTCGACGGCAATGGCG 425
Amphirosellinia_fushanensis ------CAACAAATCTCCGGCGAGCACGGTCTCGACGGCAATGGAG 464
Astrocystis_bambusae ------CAACAAATCTCCGGCGAGCACGGTCTCGACGGCAATGGAG 451
Kretzschmaria_guyanensis ------CAACAAATTTCTGGCGAGCACGGTCTCGATGGCAGCGGCG 442
Kretzschmaria_lucidula ------CAACAGATTTCCGGCGAGCACGGTCTCGATGGCAGTGGCG 436
Penzigia_cantareirensis ------CAGCAAATCTCTGGCGAGCACGGTCTTGATGGCAGCGGCG 436
Stilbohypoxylon_elaeicola ------CAGCAAATCTCGGGCGAGCATGGTCTCGATGGCAGCGGCG 515
Discoxylaria_myrmecophila ------CAACAGATTTCCGGCGAGCACGGTCTTGACGGCAATGGCG 438
Podosordaria_mexicana ------CAGCAGATCTCTGCGGAACACGGTCTCGACGGCAGTGGCG 429
Whalleya_microplaca TCAATATAGGCAAACCATCTCGGGCGAGCACGGTCTCGATACCAATGGCG 494
Creosphaeria_sassafras -TGATTTAGGCAAACCATCTCTGGCGAGCATGGCCTCGACAGCAACGGTG 483
Annulohypoxylon_cohaerens ------CAAACCATCTCTGGCGAGCACGGCCTCGACAGCAATGGCG 366
Daldinia_concentrica ------CAAACCATCTCTAGCGAGCACGGTCTCGACAGCAATGGAG 367
Hypoxylon_investiens ------CAAACCATTTCTGGCGAGCACGGTCTCGACAGCAATGGCG 395
Theissenia_cinerea ------CAGACCATCTCGGGCGAGCACGGTCTCGACAGCAATGGCG 412
Mn_majus027 ------CGAAA------GGC---CACTTCCCTGATGGC------ 204
Mn_majus049 ------CGAAA------GGC---CACTTCCCTGATGGC------ 179
Xylaria_escharoidea ------CAGACCATCTCGGGCGAGCATGGCCTTGATCCCACTGGTC 437
* ** * ** *

Entoleuca_mammta TGTATGTTC----T------TGAC-GGTCCCAT-TGAGT--CT-GGGGA- 504
Rosellinia_merrillii TGTATGTTC----T------TGAC-A-TCCCAT-TGAGT--TTTGGGGA- 520
Nemania_macrocarpa TGTATGTACACCAT------TTAC-TGTGTCCC-CCAAC--CCCGCGAA- 486
Nemania_illita TGTATGTCTAT-AG------CAAC-TATA------AAA--CCCA-GAG- 456
Amphirosellinia_fushanensis TGTATGTCT------TTG---GCGTCCG-TGGAA--CATGGGAA- 496
Astrocystis_bambusae TGTATGTCA------CAACGAACACTGG-TGGAT--TATGAAGAA 487
Kretzschmaria_guyanensis TGTATGTCCATGTCCCATGGCTAT-GGCGGCCG-TGAAC--AACCATGA- 487
Kretzschmaria_lucidula TGTATGTCCATCTC------TAT-GGCACCCG-TGAAA--AACGACGA- 474
Penzigia_cantareirensis TGTATGTCCA-GTC------GAT-GGCA-CCA-TGAAA--CGCATTAA- 472
Stilbohypoxylon_elaeicola TGTATGTCCT------T-GGCATCAA-TGGAACGAACGATGA- 549
Discoxylaria_myrmecophila TGTATGTAC------ACAGGACTTCG-GAAAG-----GCCAA- 468
Podosordaria_mexicana TGTATGTGC------ATTCGCGTCAA-CAGAAGCAACGCCGA- 464
Whalleya_microplaca TGTATGTTCC------CATGTTCCCTACACC-CCTATTTTATTTTGA- 534
Creosphaeria_sassafras TGTATGTAGC------TATAGCCCATACCCC-TTTACTTT----TGT- 519
Annulohypoxylon_cohaerens TGTAAGTA------TATGAGTCGT------CAATTCG---CAA- 394
Daldinia_concentrica TGTATGTA------TTCGAATTGT------TGATTCC---CAT- 395
Hypoxylon_investiens TGTAAGTA------TATGAGTTGT------CAATTCT---CGA- 423
Theissenia_cinerea TGTACGTATAT------CCAAGCCCGGCGACGGGTCGACGCTGCGGGGA- 455
Mn_majus027 -GTAT------CACGC-AGATGAAATACA---CAA- 228
Mn_majus049 -GTAT------CACGCCAGATGAAATACA---CAA- 204
Xylaria_escharoidea TGTAAGTATT------CGCCATTGACTTACCTGATAAATTATGCGAA- 478
***

Entoleuca_mammta ----TACC--AGAATAATTGACTAATATGCGTGG---GACAGCTACAACG 545
Rosellinia_merrillii ----CAC----GAGCAATCGACTAATGTGTTTGG---TATAGCTACAACG 559
Nemania_macrocarpa ----TCCAACAAAACGGTCGATTAATATATACGG---AACAGCTACAATG 529
Nemania_illita ----TG-GGTAGGATTGTCAACTGACATCTATGG---ATTAGCTACAACG 498
Amphirosellinia_fushanensis ----TGACGC---AC-GTCAACTGACATGGATGG---AACAGCTACAACG 535
Astrocystis_bambusae AATGTGAGGCTTTACTGTTGGCTGACACGGTTCG---AACAGTTACAACG 534
Kretzschmaria_guyanensis ----CGG-----GATGGGCGACTGACAATCATGG---AACAGGTACAATG 525
Kretzschmaria_lucidula ----CAC-----AATGACCGACTAACCATGGTGG---AACAGGTACAATG 512
Penzigia_cantareirensis ----TGG-----AATGGCCGGTTAATCATTGTGG---G-TAGTTATAACG 509
Stilbohypoxylon_elaeicola ----CGAGCC--AATGGTCGACTGACATGAGCCG---AACAGGTACAATG 590
Discoxylaria_myrmecophila ----CACCGCGCAATGATGGACTGACATGTATGG---GACAGTTACAACG 511
Podosordaria_mexicana -----GAAGAGGATCAACTGACTAATTTGATGGA---AACAGGTACAATG 506
Whalleya_microplaca -----GCCTAGAGCCAAA-AACTGACCGCTCTTCAACAACAGCTACAATG 578
Creosphaeria_sassafras -----G--TATGACCCCA-GGCTTACCACCTC----CAACAGCTATAATG 557
Annulohypoxylon_cohaerens -----GGCCAAGGATATGCAACTGACGAACCAATA--AACAGCTACAACG 437
Daldinia_concentrica -----CGACGAGAATATC-AACTAATCATCCAT---CAACAGTTACAACG 436
Hypoxylon_investiens -----TGCCAAGAATGGC-AACTAACAACCAATA---ATTAGCTACAACG 464
Theissenia_cinerea -----CGGTAACGCTGACGGACTGGCCGGCG------GATAGCTACAATG 494
Mn_majus027 ------GT--ACTGACATCCTGTC---AATAGCTACAACG 257
Mn_majus049 ------GT--ACTGACATCCTGTC---AATAGCTACAACG 233
Xylaria_escharoidea ------TCGGTCGACTGACTGGCTGGA---AACAGCTACACCG 512
* ** ** * *

Btub526_F: CGAGCGYATGAGYGTYTACTT
Entoleuca_mammta GAACGTCCGAGCTTCAGCTCGAGCGTATGAGCGTCTACTTCAACGAGGTA 595
Rosellinia_merrillii GAACCTCCGAGCTCCAGCTCGAGCGTATGAGCGTTTACTTCAACGAGGTA 609
Nemania_macrocarpa GCACCTCTGAGCTCCAGCTCGAGCGCATGAGCGTTTACTTCAACGAGGTA 579
Nemania_illita GAACCTCTGAGCTCCAGCTCGAGCGCATGAGCGTTTACTTCAACGAGGTA 548
Amphirosellinia_fushanensis GAACCTCGGAGCTCCAGTTGGAGCGCATGAGCGTTTACTTCAACGAGGTA 585
Astrocystis_bambusae GAACTTCCGAGCTTCAGCTCGAGCGCATGAGCGTTTACTTCAACGAGGTA 584
Kretzschmaria_guyanensis GAACCTCTGAGCTCCAGCTCGAGCGCATGAGCGTTTACTTCAACGAGGTA 575
Kretzschmaria_lucidula GAACCTCTGAGCTCCAGCTCGAGCGCATGAGCGTCTACTTCAACGAGGTA 562
Penzigia_cantareirensis GAACCTCTGAGCTCCAGCTCGAGCGTATGAGCGTATACTTCAACGAGGTA 559
Stilbohypoxylon_elaeicola GAACCTCCGAGCTCCAGCTTGAGCGCATGAGCGTCTACTTCAACGAGGTA 640
Discoxylaria_myrmecophila GAACCTCTGAGCTCCAGCTGGAGCGTATGAGCGTTTACTTCAACGAGGTA 561
Podosordaria_mexicana GAACTTCAGACCTCCAGCTCGAGCGCATGAGCGTCTATTTCAATGAGGTA 556
Whalleya_microplaca GCACCTCGGAGCTCCAGCTTGAGCGCATGAGTGTCTACTTCAATGAGGTA 628
Creosphaeria_sassafras GCACCTCCGAGCTCCAGCTGGAGCGGATGAAAGTGTACTTTAACGAGGTA 607
Annulohypoxylon_cohaerens GAACCTCTGAGCTCCAGCTTGAGCGCATGAGCGTCTACTTCAACGAGGTA 487
Daldinia_concentrica GTACTTCCGAGCTCCAGCTCGAGCGCATGAGCGTCTACTTCAACGAGGTA 486
Hypoxylon_investiens GAACCTCTGAGCTCCAGCTCGAGCGCATGAGCGTCTACTTCAACGAGGTA 514
Theissenia_cinerea GCACCTCGGAGCTCCAGCTCGAGCGTATGAGCGTCTACTTTAACGAGGTA 544
Mn_majus027 GCACCTCCGAGCTCCAGCTCGAGCGCATGAGTGTCTACTTCAATGAGGC- 306
Mn_majus049 GCACCTCCGAGCTCCAGCTCGAGCGCATGAGTGTCTACTTCAATGAGGC- 282
Xylaria_escharoidea GCTCCTCCGAGCTCCAGTTGGAGCGCATGAGTGTTTACTTCAATGAGGTA 562
* * ** ** ** *** * ***** **** ** ** ** ** ****
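
Because the shading that marks priming sites does not survive in plain text, the site of a degenerate primer such as Btub526_F can instead be located by expanding its IUPAC ambiguity codes (for example, Y = C or T) into a regular expression. The sketch below is illustrative only. The helper names `primer_regex` and `find_primer_sites` are hypothetical, and exact matching is an assumption made here: real primer annealing tolerates some mismatches, which this approach does not model.

```python
import re

# Standard IUPAC nucleotide ambiguity codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_regex(primer: str) -> "re.Pattern[str]":
    """Compile a degenerate primer into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def find_primer_sites(primer: str, aligned_row: str) -> list:
    """Return 0-based start positions of exact primer matches in the
    ungapped sequence taken from one alignment row."""
    seq = aligned_row.replace("-", "").upper()
    return [m.start() for m in primer_regex(primer).finditer(seq)]


btub526_f = "CGAGCGYATGAGYGTYTACTT"

# Two rows copied from the alignment block above.
entoleuca = "GAACGTCCGAGCTTCAGCTCGAGCGTATGAGCGTCTACTTCAACGAGGTA"
creosphaeria = "GCACCTCCGAGCTCCAGCTGGAGCGGATGAAAGTGTACTTTAACGAGGTA"

print(find_primer_sites(btub526_f, entoleuca))     # [19]
print(find_primer_sites(btub526_f, creosphaeria))  # []
```

The empty result for the Creosphaeria row reflects mismatches within the priming region of that sequence. A mismatch-tolerant search would be needed to locate such sites.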

Entoleuca_mammta -CGCAAACCCAC----CA-GTTCATC------GGGAGTTGTGC--CATCG 631
Rosellinia_merrillii -TGCAAATCCAC----AG-GTTCATCATGGGGGGGAGTTGTGTGGCATCG 653
Nemania_macrocarpa -CGTCAACTCGC----CG-GTCAAATTCC---GACGGCTGAAT---GCCA 617
Nemania_illita -CACTGGCCTAG----CA-ATCCACCTAT---GAT--TTGGGA---ATCG 584
Amphirosellinia_fushanensis -GATA---CCAT----CGCACCCACTCCC------GTCGCATA-CACCA 619
Astrocystis_bambusae -GATG---CCGC----C--AGCTGCTT------GTCATCCG-CATCA 614
Kretzschmaria_guyanensis -CGCAGCCCTGC----CGACTCATGTTGCT------CC-CGTGCAACCA 612
Kretzschmaria_lucidula -CGCAGCCCTCC----CGCGTCATGTTGGT------CC-CATGCAACCA 599
Penzigia_cantareirensis -CGGAACCCTACTTACCGAGTCCTGTTGGT------CT-CGTATAGCCA 600
Stilbohypoxylon_elaeicola TCACAGCACT------GATCCACGTTTCT------TCATGCACGACGA 676
Discoxylaria_myrmecophila --GCAG-TCCAC--GACAAGTTTCCCTTG------CCGCATA--ACGA 596
Podosordaria_mexicana -TGCAGTTC------AGGTCAGTCTGTT------TCTTGCAGGA--A 588
Whalleya_microplaca ---CGATC--TG-----GATCTCGTGTGG------AATACTCAA--- 656
Creosphaeria_sassafras ATGTGTTT--GG-----GAACCAGCAGACCT------AATATCCGAG-- 641
Annulohypoxylon_cohaerens -CGCAATCCAAG-----AAACCGATTCAT------AGATGCGAGGA 521
Daldinia_concentrica -TGAATTTGTAG-----GAACTAG-GGATA------AATAAACGGAGA 521
Hypoxylon_investiens -TGCAAGAATCC-----AAGTCAG-GGATG------GAGATTCAAAAG 549
Theissenia_cinerea -TGTCGTTGCGG---CCGCGTAGATGGATGC------CATCACACGA 581
Mn_majus027 ------
Mn_majus049 ------
Xylaria_escharoidea -TGCCATATCGA------TGCCCTTCTCGT------CGTGGAAGATCG 597

Entoleuca_mammta ---ATATTCTAACGTGTTG--AATTTTT--AGGGTGCTGGCAACAAGTAT 674
Rosellinia_merrillii ---ATATTCTGACATGCTG--GATTTTT--AGGGTGCTGGCAACAAATAT 696
Nemania_macrocarpa ---ATATGCTAACGTGTTGA-AACTTTC--AGGGTGCCGGTAACAAGTAC 661
Nemania_illita ---AGATTCTAACGCAGTGG-AATGTTT--AGGGTGCTGGAAATAAATAT 628
Amphirosellinia_fushanensis ----CATTCTAACGCCTTGGGGGATTTT--AGGGTGCCAACAACAAGTAT 663
Astrocystis_bambusae ---ACGATCTAACTAGTTGACCA--TCT--AGGGTGCTAACAACAAGTAC 657
Kretzschmaria_guyanensis ---ACCTTCTAACGCG-----AGTTTTC-TAGGGTGCCGGTAACAAGTAT 653
Kretzschmaria_lucidula ---ACGTTTTAACATG-----ACATTTC-TAGGGTGCCGGTAACAAGTAT 640
Penzigia_cantareirensis ---ATTTTCTAACATG-----AGCTTTTGTAGGGTGCTGGCAACAAATAT 642
Stilbohypoxylon_elaeicola ---GCATGCTAATATGTCG--AATCATC-CAGGGTGCCGGCAACAAATAT 720
Discoxylaria_myrmecophila ---CCATTCTAACACTTTTCTAAGTAC---AGGGCGCCGGCAACAAATAT 640
Podosordaria_mexicana ---ATTTACTAATTTGAT---AATCGATACAGGGTGCTGGCAACAAATTC 632
Whalleya_microplaca -----TTTCTAATCC---CCCATTATGCACAGGCTTCTGGTAACAAATAT 698
Creosphaeria_sassafras -----TTTCTAACCTTGACCCGTGATGTA--GGCCTCTGGCAACAAATAT 684
Annulohypoxylon_cohaerens G-TAGTTACTAATCAC--CCCAACATACACAGGCATCTGGTAACAAGTAT 568
Daldinia_concentrica --CAATTGCTAATTGC--CTCAACGCGTGCAGGCTTCTGGCAACAAGTAT 567
Hypoxylon_investiens GTTAGCTACTAATTAC--CCTAACCTATACAGGCTTCCGGCAACAAGTAT 597
Theissenia_cinerea G--AGATGTTGACGTG---CCGCCCTTGACAGGCCTCTGGTAACAAGTAC 626
Mn_majus027 ------TTCCGGCAACAAGTAC 322
Mn_majus049 ------TTCCGGCAACAAGTAC 298
Xylaria_escharoidea C-CACTCGCTAACGTCG-GTTGGAATATCCAGGGCGCAAGCAGCAAGTAC 645
* * ** *

Entoleuca_mammta GTCCCCCGCGCCGTTCTCGTCGATTTGGAGCCCGGTACCATGGATGCTGT 724
Rosellinia_merrillii GTCCCCCGCGCCGTTCTTGTCGATTTGGAGCCCGGTACCATGGATGCCGT 746
Nemania_macrocarpa GTCCCTCGCGCCGTTCTCGTCGATCTGGAGCCCGGTACCATGGATGCCGT 711
Nemania_illita GTTCCTCGCGCCGTCCTCGTCGATTTGGAACCTGGTACTATGGATGCTGT 678
Amphirosellinia_fushanensis GTCCCTCGCGCAGTCCTAGTCGATTTGGAACCCGGTACCATGGATGCTGT 713
Astrocystis_bambusae GTCCCTCGTGCTGTCCTCGTCGACTTGGAGCCCGGTACCATGGACGCTGT 707
Kretzschmaria_guyanensis GTCCCTCGCGCCGTCCTCGTCGATTTGGAGCCCGGCACCATGGACGCTGT 703
Kretzschmaria_lucidula GTCCCTCGTGCCGTCCTCGTCGATTTGGAGCCTGGCACCATGGACGCTGT 690
Penzigia_cantareirensis GTCCCCCGCGCCGTCCTGGTCGATTTGGAGCCCGGTACCATGGATGCTGT 692
Stilbohypoxylon_elaeicola GTCCCTCGCGCCGTTCTCGTCGATCTGGAGCCCGGTACCATGGACGCCGT 770
Discoxylaria_myrmecophila GTTCCTCGTGCCGTCCTCGTCGATCTTGAGCCCGGTACCATGGACGCTGT 690
Podosordaria_mexicana GTCCCTCGCGCCGTTCTCGTCGATCTCGAGCCCGGTACCATGGACGCTGT 682
Whalleya_microplaca GTGCCTCGCGCTGTCCTCGTCGATCTCGAGCCCGGCACCATGGACGCCGT 748
Creosphaeria_sassafras GTTCCCCGCGCTGTCCTTGTCGATCTCGAGCCCGGTACCATGGATGCTGT 734
Annulohypoxylon_cohaerens GTTCCTCGCGCCGTCCTCGTCGACCTCGAGCCCGGCACCATGGACGCCGT 618
Daldinia_concentrica GTTCCTCGTGCCGTCCTCGTCGATCTCGAGCCCGGTACCATGGACGCCGT 617
Hypoxylon_investiens GTCCCTCGCGCCGTCCTCGTCGATCTCGAGCCCGGTACCATGGACGCTGT 647
Theissenia_cinerea GTGCCTCGCGCCGTGCTGGTCGATCTCGAGCCTGGCACCATGGACGCCGT 676
Mn_majus027 GTTCCTCGTGCCGTCCTTGTCGATCTCGAGCCCGGTACCATGGATGCCGT 372
Mn_majus049 GTTCCTCGTGCCGTCCTTGTCGATCTCGAGCCCGGTACCATGGATGCCGT 348
Xylaria_escharoidea GTCCCTCGTGCCGTTCTCGTCGATTTGGAGCCTGGCACCATGGATGCTGT 695
** ** ** ** ** ** ***** * ** ** ** ** ***** ** **

Entoleuca_mammta CCGCGCCGGTCCCTTCGGTCAGCTCTTCCGACCCGACAACTTCGTCTTCG 774
Rosellinia_merrillii CCGCGCCGGTCCTTTCGGTCAGCTCTTCCGACCCGACAACTTCGTCTTCG 796
Nemania_macrocarpa CCGCGCTGGTCCCTTTGGCCAGCTCTTCCGACCCGACAACTTCGTCTTCG 761
Nemania_illita CCGTGCCGGTCCCTTCGGTCAACTCTTCCGACCTGACAACTTCGTTTTCG 728
Amphirosellinia_fushanensis CCGTGCCGGTCCCTTCGGCCAACTCTTCCGCCCCGACAACTTCGTCTTCG 763
Astrocystis_bambusae CCGCGCCGGTCCCTTTGGCCAGCTCTTCCGACCCGATAACTTCGTCTTCG 757
Kretzschmaria_guyanensis CCGTGCCGGTCCCTTCGGCCAGCTCTTCCGACCCGACAACTTTGTCTTCG 753
Kretzschmaria_lucidula CCGTGCTGGTCCCTTCGGTCAGCTCTTCCGACCCGACAACTTCGTCTTCG 740
Penzigia_cantareirensis CCGTGCTGGTCCCTTCGGTCAGCTCTTCCGACCCGACAACTTCGTCTTTG 742
Stilbohypoxylon_elaeicola CCGTGCGGGTCCTTTCGGTCAGCTCTTCCGACCCGATAACTTTGTTTTTG 820
Discoxylaria_myrmecophila CCGTGCAGGCCCCTTCGGCCAGCTCTTCCGGCCCGACAACTTCGTCTTCG 740
Podosordaria_mexicana CCGTGCTGGTCCCTTCGGTCAGCTCTTCCGACCCGACAACTTCGTCTTCG 732
Whalleya_microplaca CCGTGCCGGTCCCTTCGGTCAGCTCTTCCGTCCCGACAACTTCGTCTTCG 798
Creosphaeria_sassafras CCGTTCCGGTCCTTTTGGCCAGCTTTTCCGTCCCGACAACTTCGTCTTCG 784
Annulohypoxylon_cohaerens TCGCGCCGGTCCTTTCGGCCAGCTTTTCCGACCTGACAACTTCGTTTTCG 668
Daldinia_concentrica CCGTGCTGGTCCCTTCGGTCAGCTCTTCCGACCCGACAACTTCGTTTTCG 667
Hypoxylon_investiens CCGTGCTGGTCCCTTCGGCCAGCTTTTCCGACCCGACAACTTTGTCTTCG 697
Theissenia_cinerea CCGTGCTGGCCCCTTCGGTCAGCTCTTCCGGCCCGACAACTTCGTCTTCG 726
Mn_majus027 CCGTGCTGGTCCCTTCGGCCAGCTGTTCCGCCCCGACAACTTCGTCTTCG 422
Mn_majus049 CCGTGCTGGTCCCTTCGGCCAGCTGTTCCGCCCCGACAACTTCGTCTTCG 398
Xylaria_escharoidea CCGTGCCGGTCCTTTCGGCGAGCTCTTCCGTCCCGACAACTTCGTCTTCG 745
** * ** ** ** ** * ** ***** ** ** ***** ** ** *

Entoleuca_mammta GTCAGTCAGGTGCTGGCAACAACTGGGCCAAGGGTCACTACACGGAGGGT 824 Rosellinia_merrillii GCCAGTCAGGTGCTGGCAACAACTGGGCCAAGGGTCACTACACTGAGGGT 846 Nemania_macrocarpa GCCAGTCTGGTGCCGGCAACAACTGGGCCAAGGGTCATTACACTGAGGGT 811 Nemania_illita GTCAGTCGGGTGCTGGCAACAACTGGGCCAAGGGCCATTACACTGAGGGT 778 Amphirosellinia_fushanensis GCCAGTCTGGTGCTGGCAACAACTGGGCCAAGGGACATTACACCGAGGGT 813 Astrocystis_bambusae GCCAGTCCGGTGCTGGAAACAACTGGGCCAAGGGTCACTACACTGAGGGT 807 Kretzschmaria_guyanensis GCCAGTCTGGTGCTGGCAACAACTGGGCCAAGGGTCACTACACTGAGGGT 803 Kretzschmaria_lucidula GCCAGTCTGGTGCTGGCAACAACTGGGCCAAGGGTCACTACACTGAGGGT 790 Penzigia_cantareirensis GCCAGTCTGGTGCTGGAAACAACTGGGCGAAGGGCCACTACACCGAGGGT 792 Stilbohypoxylon_elaeicola GCCAGTCCGGTGCTGGCAACAACTGGGCCAAGGGTCACTACACTGAGGGT 870 Discoxylaria_myrmecophila GCCAGTCCGGTGCCGGCAACAACTGGGCCAAGGGTCACTACACCGAGGGT 790 Podosordaria_mexicana GCCAGTCTGGTGCCGGCAACAACTGGGCCAAGGGCCATTACACTGAGGGT 782 Whalleya_microplaca GCCAGTCTGGTGCTGGCAACAACTGGGCGAAGGGTCACTACACCGAGGGT 848 Creosphaeria_sassafras GCCAGTCCGGTGCCGG-AACAACTGGGCCAGGGGTCACTACACTGAGGGT 833 Annulohypoxylon_cohaerens GCCAGTCTGGTGCCGGAAACAACTGGGCCAAGGGTCATTACACCGAGGGT 718 Daldinia_concentrica GTCAGTCCGGTGCCGGAAACAACTGGGCCAAGGGTCATTACACTGAGGGT 717 Hypoxylon_investiens GCCAGTCTGGTGCCGGAAACAACTGGGCCAAGGGTCACTACACTGAGGGT 747 Theissenia_cinerea GCCAGTCGGGCGCCGGCAACAACTGGGCCAAGGGTCACTACACGGAGGGC 776 Mn_majus027 GTCAGTCCGGTGCTGGCAACAACTGGGCGAAGGGTCACTACACTGAGGGT 472 Mn_majus049 GTCAGTCCGGTGCTGGCAACAACTGGGCGAAGGGTCACTACACTGAGGGT 448 Xylaria_escharoidea GTCAATCCGGTGCCGGCAACAACTGGGCCAAGGGTCACTACACTGAAGGA 795 * ** ** ** ** ** *********** * *** ** ***** ** **

Entoleuca_mammta GCCGAGCTGGTCGACACCGTTCTCGATGTCGTCCGTCGCGAGGCTGAGGG 874 Rosellinia_merrillii GCCGAGCTGGTCGACCAGGTTCTCGATGTCGTCCGTCGCGAGGCTGAGGG 896 Nemania_macrocarpa GCTGAGCTGGTCGACAACGTTCTCGATGTTGTCCGCCGTGAGGCTGAGGG 861 Nemania_illita GCTGAGCTCGTTGACCAAGTTCTCGATGTCGTTCGTCGCGAGGCTGAGGG 828 Amphirosellinia_fushanensis GCTGAGCTGGTTGACAACGTTCTGGATGTTGTCCGTCGCGAGGCTGAGGG 863 Astrocystis_bambusae GCCGAGTTGGTCGACAACGTTCTCGATGTCGTCCGTCGCGAGGCCGAGGG 857 Kretzschmaria_guyanensis GCTGAGCTTGTCGACCAGGTCCTGGATGTCGTTCGTCGCGAGGCTGAGGG 853 Kretzschmaria_lucidula GCTGAGCTTGTCGACAACGTTCTCGACGTTGTTCGTCGCGAGGCCGAGGG 840 Penzigia_cantareirensis GCCGAGCTCGTTGACAACGTCCTTGACGTTGTCCGTCGTGAGGCCGAGGG 842 Stilbohypoxylon_elaeicola GCTGAGCTCGTCGACCAAGTCCTCGACGTTGTCCGTCGTGAGGCCGAGGG 920 Discoxylaria_myrmecophila GCTGAACTAGTCGACAACGTTCTCGACGTCGTCCGTCGCGAGGCCGAGGG 840 Podosordaria_mexicana GCTGAGTTGGTGGACCAGGTTCTCGACGTCGTCCGCCGTGAGGCTGAAGG 832 Whalleya_microplaca GCCGAGCTGGTCGACCAGGTTCTCGATGTCGTTCGTCGCGAGGCTGAAGG 898 Creosphaeria_sassafras GCCGAGTTAGTTGACAACGTTCTTGATGTCGTCCGTCGTGAGGCTGAGGG 883 Annulohypoxylon_cohaerens GCTGAGCTTGTCGACCAGGTTCTTGATGTCGTTCGTCGCGAGGCTGAGGG 768 Daldinia_concentrica GCTGAGTTGGTTGACCAAGTCCTCGATGTCGTTCGTCGTGAGGCTGAAGG 767 83

Hypoxylon_investiens GCTGAGCTTGTTGACAACGTCCTTGATGTCGTCCGTCGTGAGGCTGAGGG 797 Theissenia_cinerea GCTGAGCTCGTCGACCAGGTCCTCGACGTCGTCCGCCGCGAGGCTGAGGG 826 Mn_majus027 GCCGAGCTTGTCGACCAGGTCCTCGAGGTCGTCCGTCGCGAGGCTGAGGG 522 Mn_majus049 GCCGAGCTTGTCGACCAGGTCCTCGAGGTCGTCCGTCGCGAGGCTGAGGG 498 Xylaria_escharoidea GCTGAATTGGTCGACCAGGTCCTCGATGTCGTCCGCCGTGAGGCTGAGGG 845 ** ** * ** *** ** ** ** ** ** ** ** ***** ** **

Entoleuca_mammta CTGCGACTGCCTTCAAGGTTTCCAGATCACTCACTCCCTCGGCGGTGGTA 924 Rosellinia_merrillii CTGCGACTGCCTTCAAGGTTTCCAGATCACTCACTCCCTCGGTGGCGGTA 946 Nemania_macrocarpa CTGCGATTGCCTCCAGGGCTTCCAGATCACCCACTCGCTCGGTGGTGGTA 911 Nemania_illita CTGCGACTGCCTTCAGGGTTTCCAGATCACCCACTCGCTCGGTGGTGGTA 878 Amphirosellinia_fushanensis CTGCGACTGCCTCCAGGGCTTCCAGATCACCCACTCCCTCGGTGGTGGTA 913 Astrocystis_bambusae CTGCGATTGCCTCCAAGGTTTCCAGATCACCCACTCGCTCGGTGGTGGTA 907 Kretzschmaria_guyanensis CTGCGACTGCCTCCAGGGTTTCCAAATCACCCACTCGCTCGGTGGTGGTA 903 Kretzschmaria_lucidula CTGCGACTGCCTCCAGGGCTTCCAAATCACCCACTCGCTCGGTGGCGGCA 890 Penzigia_cantareirensis CTGCGACTGCCTCCAGGGTTTCCAGATCACTCACTCGCTCGGTGGTGGTA 892 Stilbohypoxylon_elaeicola CTGCGACTGCCTCCAGGGTTTCCAGATCACCCACTCGCTTGGTGGCGGTA 970 Discoxylaria_myrmecophila CTGTGACTGCCTTCAGGGCTTCCAGATCACACACTCTCTCGGTGGTGGTA 890 Podosordaria_mexicana CTGCGACTGCCTGCAGGGTTTCCAGATCACCCACTCTCTCGGTGGTGGAA 882 Whalleya_microplaca CTGCGACTGCTTC-AGGGTTTCCAGATTACCCACTCTCTCGGCGGTGGTA 947 Creosphaeria_sassafras CTGTGACTGCCTCCAGGGTTTCCAGATCACCCACTCTCTCGGTGGTGGTA 933 Annulohypoxylon_cohaerens ATGTGATTGCCTCCAGGGTTTCCAGATCACCCACTCCCTCGGTGGTGGTA 818 Daldinia_concentrica CTGTGACTGCCTCCAGGGTTTCCAGATCACCCACTCCCTCGGTGGTGGTA 817 Hypoxylon_investiens TTGCGACTGCCTCCAGGGTTTCCAGATCACCCATTCTCTCGGTGGTGGTA 847 Theissenia_cinerea CTGCGACTGCCTCCAGGGCTTCCAGATCACCCACTCCCTCGGCGGTGGTA 876 Mn_majus027 CTGCGACTGCCTTCAGGGTTTCCAGATCACCCACTCTCTCGGTGGTGGTA 572 Mn_majus049 CTGCGACTGCCTTCAGGGTTTCCAGATCACCCACTCTCTCGGTGGTGGTA 548 Xylaria_escharoidea CTGTGACTGCCTCCAGGGCTTCCAGATCACCCACTCTCTCGGTGGTGGTA 895 ** ** *** * * ** ***** ** ** ** ** ** ** ** ** *

Entoleuca_mammta CTGGTGCCGGTATGGGTACGCTGTTGATCTCCAAGATCCGCGAA------968 Rosellinia_merrillii CTGGTGCCGGTATGGGTACGCTGCTGATCTCCAAGATCCGCGAG------990 Nemania_macrocarpa CTGGTGCCGGTATGGGTACGCTATTGATCTCCAAGATTCGCGAG------955 Nemania_illita CTGGTGCCGGTATGGGTACTCTGCTAATCTCCAAGATTCGCGAG------922 Amphirosellinia_fushanensis CCGGTGCCGGTATGGGTACGCTGCTGATCTCCAAGATCCGCGAG------957 Astrocystis_bambusae CCGGTGCCGGTATGGGTACGCTGTTGATCTCCAAGATCCGCGAG------951 Kretzschmaria_guyanensis CCGGTGCCGGTATGGGTACTCTGTTGATCTCCAAGATTCGCGAG------947 Kretzschmaria_lucidula CCGGTGCCGGTATGGGTACTCTGCTGATCTCCAAGATCCGCGAG------934 Penzigia_cantareirensis CTGGTGCCGGTATGGGTACTCTGCTGATCTCCAAGATCCGCGAG------936 Stilbohypoxylon_elaeicola CCGGTGCCGGTATGGGTACGCTGTTGATCTCCAAGATCCGTGAG------1014 Discoxylaria_myrmecophila CTGGTGCCGGTATGGGTACGCTGTTGATCTCCAAGATCCGTGAG------934 Podosordaria_mexicana CTGGTGCCGGTATGGGTACTCTGTTGATCTCCAAAATCCGCGAG------926 Whalleya_microplaca CTGGTGCTGGTATGGGTACCTTGTTGATCTCCAAGATCCGTGAG------991 Creosphaeria_sassafras CCGGTGCCGGTATGGGTACCCTGCTGATCTCCAAGATCCGCGAA------977 Annulohypoxylon_cohaerens CCGGTGCCGGTATGGGAACCTTGTTAATCTCCAAGATCCGCGAG------862 Daldinia_concentrica CTGGTGCCGGTATGGGTACTCTGTTGATCTCCAAGATCCGCGAG------861 Hypoxylon_investiens CTGGTGCCGGTATGGGTACTCTCCTGATCTCCAAGATCCGTGAG------891 Theissenia_cinerea CCGGTGCCGGTATGGGTACCCTGCTGATTTCCAAGATCCGCGAG------920 Mn_majus027 CCGGTGCCGGTATGGGTACGCTCCTCATCTCCAAGATCCGCGA------615 Mn_majus049 CCGGTGCCGGTATGGGTACGCTCCTCATCTCCAAGATCCGCGA------591 Xylaria_escharoidea CCGGTTCCGGTATGGGAACGTTGCTCATCTCCAAGATCCGAGAAGGTCCG 945 * *** * ******** ** * * ** ***** ** ** **

Entoleuca_mammta            ------
Rosellinia_merrillii        ------
Nemania_macrocarpa          ------
Nemania_illita              ------
Amphirosellinia_fushanensis ------
Astrocystis_bambusae        ------
Kretzschmaria_guyanensis    ------
Kretzschmaria_lucidula      ------
Penzigia_cantareirensis     ------
Stilbohypoxylon_elaeicola   ------
Discoxylaria_myrmecophila   ------
Podosordaria_mexicana       ------
Whalleya_microplaca         ------
Creosphaeria_sassafras      ------
Annulohypoxylon_cohaerens   ------
Daldinia_concentrica        ------
Hypoxylon_investiens        ------
Theissenia_cinerea          ------
Mn_majus027                 ------
Mn_majus049                 ------
Xylaria_escharoidea         TTGCTCCCCGTGTGTCCGCTCCTAGAGCTTCGACTGATATATACCCCTTA 995

Entoleuca_mammta --GAGTTCCCCGACCGCATGATGGCTACCTTCTCTGTTATGCCCTCTCCC 1016 Rosellinia_merrillii --GAGTTCCCCGATCGTATGATGGCTACCTTCTCTGTTATGCCCTCCCCC 1038 Nemania_macrocarpa --GAGTTCCCCGACCGCATGATGGCCACCTTCTCTGTCATGCCCTCTCCT 1003 Nemania_illita --GAGTTCCCTGACCGCATGATGGCCACCTTCTCCGTTATGCCCTCTCCC 970 Amphirosellinia_fushanensis --GAGTTCCCGGACCGCATGATGGCTACCTTCTCTGTCATGCCCTCTCCC 1005 Astrocystis_bambusae --GAGTTCCCCGACCGCATGATGGCTACCTTCTCCGTCATGCCCTCTCCT 999 Kretzschmaria_guyanensis --GAATTCCCCGACCGTATGATGGCCACCTTCTCTGTCATGCCCTCTCCC 995 Kretzschmaria_lucidula --GAGTTCCCTGACCGCATGATGGCCACCTTCTCCGTCATGCCCTCGCCC 982 Penzigia_cantareirensis --GAGTTCCCTGACCGCATGATGGCCACCTTCTCCGTTATGCCTTCTCCT 984 Stilbohypoxylon_elaeicola --GAGTTCCCTGACCGCATGATGGCCACCTTCTCCGTCATGCCCTCACCC 1062 Discoxylaria_myrmecophila --GAGTTCCCCGACCGCATGATGGCCACCTTCTCCGTCATGCCCTCTCCT 982 Podosordaria_mexicana --GAGTTCCCCGACCGCATGATGGCAACTTTCTCCGTCATGCCTTCCCCC 974 Whalleya_microplaca --GAGTTTCCCGACCGCATGATGGCTACCTTCTCCGTCGTCCCCTCCCCC 1039 Creosphaeria_sassafras --GAGTTCCCCGACCGTATGATGGCTACCTTCTCCGTCGTCCCCTCCCCC 1025 Annulohypoxylon_cohaerens --GAGTTCCCCGACCGAATGATGGCTACCTTCTCCGTCGTTCCCTCTCCC 910 Daldinia_concentrica --GAGTTCCCCGACCGCATGATGGCCACCTTCTCCGTCATGCCCTCCCCT 909 Hypoxylon_investiens --GAGTTCCCCGACCGCATGATGGCCACCTTCTCCGTTGTGCCTTCCCCT 939 Theissenia_cinerea --GAGTTCCCGGACCGCATGATGGCCACCTTCTCCGTCGTCCCCTCCCCC 968 Mn_majus027 --GAGTTCCCTGACCGTATGATGGCCACCTTCTCGGTTGTCCCCTCGCCC 663 Mn_majus049 --GAGTTCCCTGACCGTATGATGGCCACCTTCTCGGTTGTCCCCTCGCCC 639 Xylaria_escharoidea CAGAGTTCCCCGACCGCATGATGGCCACTTTCAGTGTTGTTCCCTCCCCC 1045 ** ** ** ** ** ******** ** *** ** * ** ** **

Entoleuca_mammta AAGGTCTCCGACACCGTTGTTGAGCCCTACAACGCCACCCTCTCCGTCCA 1066 Rosellinia_merrillii AAGGTCTCCGACACCGTTGTTGAGCCCTACAACGCCACCCTCTCCGTCCA 1088 Nemania_macrocarpa AAGGTCTCGGACACTGTTGTCGAGCCCTACAACGCCACCCTCTCCGTCCA 1053 Nemania_illita AAGGTCTCGGATACCGTTGTCGAGCCCTACAACGCCACTCTCTCAGTCCA 1020 Amphirosellinia_fushanensis AAGGTCTCAGACACCGTCGTCGAGCCCTACAACGCTACCCTCTCCGTCCA 1055 Astrocystis_bambusae AAGGTTTCCGACACCGTCGTCGAGCCCTACAACGCCACCCTCTCCGTACA 1049 Kretzschmaria_guyanensis AAGGTTTCGGACACCGTCGTCGAGCCTTACAACGCCACCCTCTCCGTCCA 1045 Kretzschmaria_lucidula AAGGTTTCGGACACCGTCGTCGAGCCTTACAATGCCACGCTCTCCGTCCA 1032 Penzigia_cantareirensis AAGGTTTCCGACACCGTTGTCGAGCCTTACAATGCCACTCTCTCCGTCCA 1034 Stilbohypoxylon_elaeicola AAGGTCTCCGACACCGTTGTCGAGCCCTACAACGCCACCCTCTCCGTCCA 1112 Discoxylaria_myrmecophila AAGGTCTCCGACACCGTCGTCGAGCCCTACAACGCCACACTCTCCGTCCA 1032 Podosordaria_mexicana AAGGTCTCCGACACCGTCGTCGAGCCTTATAACGCCACACTCTCGATCCA 1024 Whalleya_microplaca AAGGTCTCTGACACCGTCGTTGAGCCTTACAACGCCACCCTGTCCGTCCA 1089 Creosphaeria_sassafras AAGGTTTCCGACACCGTCGTCGAGCCTTACAACGCCACTCTGTCAGTTCA 1075 Annulohypoxylon_cohaerens AAGGTCTCCGACACCGTCGTCGAGCCTTACAACGCCACCCTGTCCGTCCA 960 Daldinia_concentrica AAGGTTTCCGATACCGTTGTCGAGCCTTACAACGCCACCCTCTCCGTCCA 959 Hypoxylon_investiens AAGGTTTCCGACACTGTCGTTGAACCCTACAACGCCACTCTCTCGGTCCA 989 Theissenia_cinerea AAGGTCTCCGACACCGTTGTCGAGCCTTACAATGCTACCCTCTCCGTCCA 1018 Mn_majus027 AAGGTCTCCGACACCGTCGTCGAGCCTTACA-CGCCACCCTGTCCATCCA 712 Mn_majus049 AAGGTCTCCGACACCGTCGTCGAGCCTTACAACGCCACCCTGTCCATCCA 689 Xylaria_escharoidea AAGGTGTCCGATACCGTCGTCGAACCCTACAACGCCACCCTTTCGGTTCA 1095 ***** ** ** ** ** ** ** ** ** * ** ** ** ** * **

Entoleuca_mammta CCAGCTGGTCGAGAACTCCGACGAGACCTTCTGTATCGATAACGAAGCTC 1116 Rosellinia_merrillii CCAGTTGGTTGAGAACTCCGACGAGACCTTCTGTATCGATAACGAAGCTC 1138 Nemania_macrocarpa CCAGCTGGTCGAGAACTCAGACGAGACCTTCTGCATTGACAATGAGGCTC 1103 Nemania_illita CCAGCTGGTTGAGAACTCCGACGAGACCTTCTGTATCGACAACGAGGCTC 1070 Amphirosellinia_fushanensis CCAGCTGGTCGAGAACTCAGACGAGACCTTCTGTATCGACAATGAGGCTC 1105 Astrocystis_bambusae CCAGCTTGTCGAGAACTCGGACGAGACCTTCTGCATTGACAACGAGGCCT 1099 Kretzschmaria_guyanensis CCAGTTGGTCGAGAACTCCGATGAAACCTTCTGCATTGATAACGAGGCGC 1095 Kretzschmaria_lucidula CCAGTTGGTCGAGAACTCAGATGAGACCTTCTGCATTGATAACGAGGCTC 1082 Penzigia_cantareirensis CCAGCTGGTCGAGAACTCCGACGAGACCTTCTGCATTGATAACGAGGCTT 1084 Stilbohypoxylon_elaeicola CCAGCTGGTCGAGAACTCCGATGAGACCTTCTGTATTGACAACGAGGCTC 1162 Discoxylaria_myrmecophila CCAGCTTGTCGAGAACTCCGACGAGACCTTCTGTATCGACAACGAGGCCC 1082 Podosordaria_mexicana CCAGCTGGTCGAGAACTCGGATGAGACCTTCTGCATTGACAACGAGGCCT 1074 Whalleya_microplaca CCAGCTGGTCGAGAACTCGGACGAAACCTTCTGTATCGATAACGAGGCCC 1139 Creosphaeria_sassafras CCAGCTGGTTGAGAACTCCGACGAGACCTTCTGTATCGACAACGAAGCTC 1125 Annulohypoxylon_cohaerens TCAGCTGGTCGAGAACTCGGACGAGACCTTCTGCATCGACAACGAGGCTC 1010 85

Daldinia_concentrica CCAGCTGGTCGAGAACTCCGATGAGACCTTCTGTATCGATAACGAGGCTC 1009 Hypoxylon_investiens CCAGCTGGTCGAGAACTCCGACGAGACCTTCTGCATTGACAACGAGGCTC 1039 Theissenia_cinerea CCAGCTGGTCGAGAACTCTGACGAGACTTTCTGTATCGATAACGAGGCCC 1068 Mn_majus027 CCAGCTTGTCGAGAACTCGGACGAGACTTTCTGCATTGACAACGAG-CTC 761 Mn_majus049 CCAGCTTGTCGAGAACTCGGACGAGACTTTCTGCATTGACAACGAGGCTC 739 Xylaria_escharoidea CCAGCTGGTTGAAAACTCCGACCAGACTTTCTGTATCGATAACGAGGCGC 1145 *** * ** ** ***** ** * ** ***** ** ** ** ** *

Entoleuca_mammta TGTACGATATCTGCATGCGTACACTGAAGCTATCCAACCCCTCATACGGT 1166 Rosellinia_merrillii TGTACGATATCTGCATGCGTACACTGAAGCTATCCAACCCCTCGTATGGC 1188 Nemania_macrocarpa TGTACGACATTTGCATGCGTACCCTGAAGCTATCCAACCCCTCATACGGT 1153 Nemania_illita TATACGACATCTGCATGCGCACCCTAAAGCTATCCAACCCCTCGTACGGT 1120 Amphirosellinia_fushanensis TATACGATATTTGCATGCGTACCCTGAAGCTATCCAACCCCTCATACGGT 1155 Astrocystis_bambusae TGTACGATATCTGCATGCGCACCCTGAAGCTGTCCAACCCCTCATACGGT 1149 Kretzschmaria_guyanensis TGTACGATATCTGCATGCGCACCTTGAAGCTATCCAACCCTTCATACGGT 1145 Kretzschmaria_lucidula TCTACGATATCTGCATGCGCACCTTGAAGCTCTCCAACCCTTCATACGGT 1132 Penzigia_cantareirensis TGTACGATATCTGCATGCGTACTTTGAAGCTATCCAACCCTTCATACGGT 1134 Stilbohypoxylon_elaeicola TTTACGATATCTGCATGCGTACCTTGAAGCTATCCAACCCCTCGTACGGC 1212 Discoxylaria_myrmecophila TATACGACATCTGCATGCGTACCCTGAAGCTGTCCAACCCCTCGTACGGC 1132 Podosordaria_mexicana TGTACGACATCTGCATGCGCACCCTGAAGTTGTCCAACCCCTCGTACGGC 1124 Whalleya_microplaca TGTACGACATCTGCATGCGTACCCTCAAGTTGTCCAACCCCTCCTACGGC 1189 Creosphaeria_sassafras TGTACGATATCTGCATGCGTACTCTCAAGCTGTCCAACCCCTCTTACGGC 1175 Annulohypoxylon_cohaerens TCTACGACATCTGCATGCGCACGTTGAAGCTGTCTAACCCTTCGTACGGT 1060 Daldinia_concentrica TGTACGACATCTGCATGCGCACGCTAAAGCTGTCCAACCCCTCGTACGGT 1059 Hypoxylon_investiens TGTACGACATCTGCATGCGTACCCTGAAGCTATCCAACCCCTCGTACGGT 1089 Theissenia_cinerea TGTACGACATTTGCATGCGCACTTTGAAGCTGTCGAACCCCTCGTATGGC 1118 Mn_majus027 TGTACGACATCTGCATGCGTACCCTGAAGCTCTCCAACCC-TCATACGGG 810 Mn_majus049 TGTACGACATCTGCATGCGTACCCTGAAGCTCTCCAACCCCTCATACGGC 789 Xylaria_escharoidea TCTACGATATCTGCCAACGCACCCTGAAGCTCACCAACCCCTCATACGGT 1195 * ***** ** *** ** ** * *** * * ***** ** ** **

Entoleuca_mammta GACCTGAACCACCTGGTCTCCGCCGTCATGTCTGGTGTCACCACCTGCCT 1216 Rosellinia_merrillii GACCTGAACCACCTCGTCTCCGCCGTTATGTCCGGTGTCACCACCTGCTT 1238 Nemania_macrocarpa GACCTGAACCACCTTGTCTCCGCCGTCATGTCCGGTGTCACGACCTGCCT 1203 Nemania_illita GATCTGAACCACCTCGTCTCTGCTGTCATGTCTGGTGTCACCACATGCCT 1170 Amphirosellinia_fushanensis GACTTGAACCACCTGGTCTCCGCCGTCATGTCTGGCGTCACTACTTGCCT 1205 Astrocystis_bambusae GACCTGAACCACCTCGTGTCCGCCGTCATGTCCGGCGTCACAACTTGCCT 1199 Kretzschmaria_guyanensis GACTTGAACCACCTTGTCTCCGCTGTCATGTCCGGCGTGACCACCTGCCT 1195 Kretzschmaria_lucidula GACTTGAACCACCTTGTCTCAGCTGTCATGTCTGGCGTGACCACCTGCCT 1182 Penzigia_cantareirensis GACCTGAACCACCTTGTCTCTGCCGTCATGTCTGGCGTGACCACCTGCCT 1184 Stilbohypoxylon_elaeicola GACCTGAACCACCTGGTCTCCGCTGTCATGTCTGGCGTTACCACCTGCCT 1262 Discoxylaria_myrmecophila GATCTCAACCACCTCGTCTCCGCTGTCATGTCCGGTGTTACCACCTGCCT 1182 Podosordaria_mexicana GATCTGAATCACCTCGTCTCCGCTGTCATGTCTGGCGTCACCACCTGCCT 1174 Whalleya_microplaca GACCTGAACCACCTGGTCTCTGCTGTCATGTCCGGTGTTACTACCTGCCT 1239 Creosphaeria_sassafras GACCTTAACCACCTGGTCTCTGCCGTCATGTCTGGTGTCACCACTTGCTT 1225 Annulohypoxylon_cohaerens GACCTGAACCACCTGGTCTCCGCCGTCATGTCTGGTGTCACCACCTGCTT 1110 Daldinia_concentrica GACCTGAACCACCTGGTCTCCGCCGTCATGTCCGGCGTTACTACTTGCTT 1109 Hypoxylon_investiens GACCTGAACCACCTAGTCTCTGCCGTCATGTCCGGTGTTACCACCTGTTT 1139 Theissenia_cinerea GATCTCAACCACCTCGTCTCCGCTGTTATGTCCGGCGTCACTACCTGCCT 1168 Mn_majus027 GACCTGAACTACCTCGTCTCTGCGG-CATGTCGGG-GTCACC--CTGCCT 856 Mn_majus049 GACCTGAACTACCTCGTCTCTGCCGTCATGTCCGG-GTCACCACCTGCCT 838 Xylaria_escharoidea GACTTGAACCACCTTGTGTCTGCCGTCATGTCGGGTGTTTCCACCTCCCT 1245 ** * ** **** ** ** ** * ***** ** ** * * *

Entoleuca_mammta GCGTTTCCCT-GGTCAGCTTAACTCTGACCTGCGCAAGTTGGCTGTGAAC 1265 Rosellinia_merrillii GCGTTTCCCT-GGTCAGCTTAACTCTGACCTACGCAAGTTGGCCGTGAAC 1287 Nemania_macrocarpa GCGTTTCCCT-GGTCAACTTAACTCTGACCTGCGCAAGTTGGCTGTGAAC 1252 Nemania_illita GCGTTTCCCT-GGTCAACTTAACTCTGATCTGCGCAAGTTGGCTGTGAAC 1219 Amphirosellinia_fushanensis GCGTTTCCCT-GGTCAGCTTAACTCTGACCTGCGCAAGTTGGCCGTCAAC 1254 Astrocystis_bambusae GCGTTTCCCT-GGTCAGCTTAACTCTGACCTGCGCAAGCTGGCCGTCAAC 1248 Kretzschmaria_guyanensis GCGTTTCCCC-GGTCAGCTTAACTCTGATCTGCGCAAGTTGGCCGTGAAC 1244 Kretzschmaria_lucidula GCGTTTCCCC-GGTCAGCTTAACTCTGATCTGCGCAAGTTGGCCGTGAAC 1231 Penzigia_cantareirensis GCGTTTCCCC-GGTCAGCTTAACTCCGACCTACGCAAATTGGCCGTGAAC 1233 Stilbohypoxylon_elaeicola CCGTTTCCCA-GGTCAGCTGAACTCTGATCTGCGTAAATTGGCCGTCAAC 1311 Discoxylaria_myrmecophila GCGTTTCCCT-GGTCAGCTCAACTCTGATCTGCGCAAGTTGGCCGTGAAC 1231 Podosordaria_mexicana GCGCTTCCCC-GGTCAGCTGAACTCTGACCTGCGCAAGCTGGCCGTCAAC 1223 Whalleya_microplaca GCGTTTCCCT-GGTCAGCTGAACTCTGACCTGCGCAAGCTGGCTGTCAAC 1288 Creosphaeria_sassafras GCGTTTCCCT-GGTCAGCTTAACTCTGACTTGCGCAAGCTTGCCGTCAAC 1274 Annulohypoxylon_cohaerens GCGTTTCCCC-GGCCAGCTGAACTCTGACCTGCGCAAACTCGCCGTGAAC 1159 86

Daldinia_concentrica GCGTTTCCCT-GGTCAGCTGAACTCTGACCTGCGCAAGCTTGCCGTGAAC 1158 Hypoxylon_investiens GCGCTTCCCT-GGTCAGCTGAACTCTGACTTGCGCAAGCTTGCCGTGAAC 1188 Theissenia_cinerea GCGCTTCCCC-GGTCAGCTGAACTCTGACTTGCGCAAGCTTGCCGTGAAC 1217 Mn_majus027 GCGTTTCCCG------866 Mn_majus049 GCGTTTCCCCCGGCCAGCTCAACTCTGACCTGCGCAAG------876 Xylaria_escharoidea CCGTTTCCCC-GGCCAGCTCAACTCCGACTTGCGTAAGCTAGCCGTCAAC 1294 ** *****

Entoleuca_mammta ATGGT------1270 Rosellinia_merrillii ATGGT------1292 Nemania_macrocarpa ATGGT------1257 Nemania_illita ATGGT------1224 Amphirosellinia_fushanensis ATGGT------1259 Astrocystis_bambusae ATGGT------1253 Kretzschmaria_guyanensis ATGGT------1249 Kretzschmaria_lucidula ATGGT------1236 Penzigia_cantareirensis ATGGT------1238 Stilbohypoxylon_elaeicola ATGGT------1316 Discoxylaria_myrmecophila ATGGT------1236 Podosordaria_mexicana ATGGT------1228 Whalleya_microplaca ATGGT------1293 Creosphaeria_sassafras ATGGT------1279 Annulohypoxylon_cohaerens ATGGT------1164 Daldinia_concentrica ATGGT------1163 Hypoxylon_investiens ATGGT------1193 Theissenia_cinerea ATGGT------1222 Mn_majus027 ------Mn_majus049 ------Xylaria_escharoidea ATGGTAAGCTATCCCTAAATCGCATGGTATCCCGTGTCTCAGCCAAATGA 1344

Entoleuca_mammta ------GCCGTTCCCCCGTCTGCACTTCTTCATGG 1299 Rosellinia_merrillii ------GCCGTTCCCTCGTCTGCACTTTTTCATGG 1321 Nemania_macrocarpa ------TCCCTTCCCTCGTCTGCACTTCTTCATGG 1286 Nemania_illita ------GCCCTTCCCTCGTCTGCACTTCTTCATGG 1253 Amphirosellinia_fushanensis ------GCCATTCCCTCGTCTGCACTTCTTCATGG 1288 Astrocystis_bambusae ------GCCATTCCCTCGTCTGCACTTCTTCATGG 1282 Kretzschmaria_guyanensis ------GCCCTTCCCTCGTCTGCACTTCTTCATGG 1278 Kretzschmaria_lucidula ------GCCCTTCCCACGTCTGCACTTCTTCATGG 1265 Penzigia_cantareirensis ------GCCCTTCCCCCGTCTGCACTTCTTCATGG 1267 Stilbohypoxylon_elaeicola ------GCCCTTCCCTCGTCTGCACTTCTTCATGG 1345 Discoxylaria_myrmecophila ------TCCTTTCCCTCGTCTCCATTTCTTCATGG 1265 Podosordaria_mexicana ------GCCTTTCCCTCGTCTCCACTTCTTCATGA 1257 Whalleya_microplaca ------TCCCTTCCCTCGTCTCCACTTCTTCATGG 1322 Creosphaeria_sassafras ------TCCTTTCCCTCGTCTTCACTTCTTCATGG 1308 Annulohypoxylon_cohaerens ------TCCTTTCCCCCGTCTCCATTTCTTCATGG 1193 Daldinia_concentrica ------TCCTTTCCCTCGTCTCCATTTCTTCATGG 1192 Hypoxylon_investiens ------TCCTTTCCCTCGTCTCCACTTCTTCATGG 1222 Theissenia_cinerea ------GCCCTTCCCTCGTCTGCACTTCTTCATGG 1251 Mn_majus027 ------Mn_majus049 ------Xylaria_escharoidea TGCTTACCGCTCTGAACAGGTTCCCTTCCCTCGTCTCCACTTCTTCATGG 1394

Entoleuca_mammta TCGGCTTTGCTCCTCTTACCAGCCGTGGTGCCCACTCTTTCCGTGCCGTC 1349 Rosellinia_merrillii TCGGCTTTGCTCCTCTGACCAGCCGTGGTGCCCACTCTTTCCGTGCCGTC 1371 Nemania_macrocarpa TCGGCTTCGCCCCTCTCACCAGCCGCGGTGCCCACTCCTTCCGCGCTGTC 1336 Nemania_illita TCGGCTTTGCTCCCCTCACCAGCCGTGGTGCGCACTCTTTCCGCGCTGTC 1303 Amphirosellinia_fushanensis TCGGCTTTGCCCCTCTTACTAGTCGTGGTGCTCACTCTTTCCGTGCTGTG 1338 Astrocystis_bambusae TTGGCTTCGCACCTCTGACTAGCCGTGGTGCTGGTGCTTTCCGTGCCGTC 1332 Kretzschmaria_guyanensis TCGGCTTTGCTCCCCTCACCAGCCGAGGCGCCCACTCCTTCCGCGCCGTC 1328 Kretzschmaria_lucidula TCGGCTTTGCTCCTCTCACCAGCCGTGGCGCTTACTCTTTCCGCGCCGTC 1315 Penzigia_cantareirensis TGGGCTTTGCTCCTCTCACCAGCCGTGGTGCTCACTCCTTCCGTGCCGTC 1317 Stilbohypoxylon_elaeicola TCGGCTTTGCTCCCCTTACCAGCCGTGGTGCTCACTCTTTCCGGGCTGTC 1395 Discoxylaria_myrmecophila TTGGCTTCGCTCCTCTCACTAGTCGTGGTGCCTATTCTTTCCGTGCTGTC 1315 Podosordaria_mexicana TTGGCTTCGCCCCCCTCACAAGTCGTGGCGCGTACTCTTTCCGTGCTGTC 1307 Whalleya_microplaca TTGGCTTCGCTCCCTTGACGAGCCGTGGTGCTCACTCTTTCCGAGCTGTC 1372 Creosphaeria_sassafras TCGGCTTCGCTCCCTTGACCAGCCGTGGCGCCTACTCCTTCCGCGCTGTC 1358 Annulohypoxylon_cohaerens TCGGCTTTGCTCCCCTGACCAGCCGTGGCGCCTACTCTTTCCGTGCCGTT 1243 87

Daldinia_concentrica TTGGCTTCGCTCCCCTGACCAGCCGTGGCGCTCACTCTTTCCGTGCCGTC 1242 Hypoxylon_investiens TCGGCTTCGCTCCTTTGACCAGCCGTGGTGCTCACTCCTTCCGTGCCGTT 1272 Theissenia_cinerea TCGGGTTTGCACCCCTGACCAGCCGCGGTGCTCACTCTTTCCGCGCCGTC 1301 Mn_majus027 ------Mn_majus049 ------Xylaria_escharoidea TCGGATTTGCTCCCCTGACCAGCCGCGGTGCCCACTCCTTCCGCGCCGTT 1444

Btub1332_R (rev. c) TTYGACYCYAAGAACATGA Entoleuca_mammta ACGGTTCCTGAGTTGACTCAACAAATGTTCGACCCCAAGAACATGATGGC 1399 Rosellinia_merrillii ACGGTTCCTGAGTTGACTCAACAGATGTTCGACCCCAAGAACATGATGGC 1421 Nemania_macrocarpa ACCGTTCCCGAGTTGACCCAGCAAATGTTCGACCCCAAGAACATGATGGC 1386 Nemania_illita ACCGTTCCTGAATTGACTCAGCAAATGTTCGACCCCAAGAACATGATGGC 1353 Amphirosellinia_fushanensis ACGGTCCCCGAGCTGACGCAGCAAATGTTCGACCCCAAGAACATGATGGC 1388 Astrocystis_bambusae TCGGTCCCCGAGCTGACCCAGCAAATGTTCGACCCCAAGAACATGATGGC 1382 Kretzschmaria_guyanensis ACCGTGCCCGAGTTGACCCAGCAGATGTTTGACCCCAAGAACATGATGGC 1378 Kretzschmaria_lucidula ACCGTGCCCGAGCTGACCCAGCAAATGTTCGACTCCAAGAACATGATGGC 1365 Penzigia_cantareirensis ACGGTTCCTGAGCTGACTCAGCAAATGTTCGACCCCAAGAACATGATGGC 1367 Stilbohypoxylon_elaeicola ACTGTTCCGGAATTGACTCAGCAGATGTTCGACCCTAAGAACATGATGGC 1445 Discoxylaria_myrmecophila ACGGTCCCCGAGTTGACTCAGCAAATGTTCGACCCCAAGAACATGATGGC 1365 Podosordaria_mexicana ACTGTCCCCGACTTGACACAGCAGATGTTCGACCCCAAGAACATGATGGC 1357 Whalleya_microplaca ACCGTCCCCGAGTTGACGCAGCAGATGTTCGACCCCAAGAACATGATGGC 1422 Creosphaeria_sassafras ACCGTTCCCGAGTTGACCCAGCAGATGTTCGACCCCAAGAACATGATGGC 1408 Annulohypoxylon_cohaerens ACCGTTCCCGAGTTAACTCAGCAGATGTTTGACCCCAAGAACATGATGGC 1293 Daldinia_concentrica ACCGTCCCTGAGTTGACTCAGCAGATGTTCGACCCCAAGAACATGATGGC 1292 Hypoxylon_investiens ACTGTTCCCGAGTTAACTCAGCAAATGTTCGACCCCAAGAACATGATGGC 1322 Theissenia_cinerea ACCGTTCCCGAGTTGACCCAGCAGATGTTCGACCCCAAGAACATGATGGC 1351 Mn_majus027 ------Mn_majus049 ------Xylaria_escharoidea ACTGTGCCCGAGCTCACGCAGCAACTCTTCGATCCCAAGAACATGATGAC 1494

Entoleuca_mammta CGCCGCTGACTTCCGCAACGGTCGCTACCTAACATGCTCTGCCATCCTGT 1449 Rosellinia_merrillii TGCCGCCGACTTCCGCAACGGTCGTTACCTAACATGCTCTGCCATCTTGT 1471 Nemania_macrocarpa TGCTGCTGATTTCCGCAACGGTCGCTACCTAACATGCTCTGCCATTTTGT 1436 Nemania_illita CGCCGCCGATTTCCGTAACGGTCGTTACCTCACATGCTCTGCCATTTTGT 1403 Amphirosellinia_fushanensis TGCCGCTGACTTCCGCAACGGTCGTTACCTGACCTGCTCTGCTATCTTGT 1438 Astrocystis_bambusae TGCTGCTGACTTCCGTAATGGTCGTTACCTGACCTGCTCTGCTATCTTGT 1432 Kretzschmaria_guyanensis CGCCGCCGACTTCCGTAACGGTCGTTACCTCACATGCTCTGCCATTTTGT 1428 Kretzschmaria_lucidula CGCTGCTGACTTCCGTAACGGTCGCTACCTCACATGCTCTGCTATTTTGT 1415 Penzigia_cantareirensis TGCTGCTGACTTCCGTAACGGTCGTTACCTCACATGCTCTGCTATCTTGT 1417 Stilbohypoxylon_elaeicola TGCTGCTGACTTCCGTAACGGGCGCTACCTGACATGCTCTGCCATCTTGT 1495 Discoxylaria_myrmecophila TGCAGCTGATTTCCGTAATGGCCGCTACCTCACATGCTCTGCCATCTTGT 1415 Podosordaria_mexicana AGCTGCTGACTTCCGTAACGGTCGTTATCTCACATGCTCCGCGATCTTGT 1407 Whalleya_microplaca TGCCTCTGACTTCAGAAATGGTCGTTACCTAACATGTTCAGCAATCTTGT 1472 Creosphaeria_sassafras TGCTTCTGACTTCCGCAACGGTCGTTACCTAACGTGCTCTGCCATCTTGT 1458 Annulohypoxylon_cohaerens TGCTTCTGACTTCCGCAACGGTCGCTACCTGACGTGCTCTGCCATCTTGT 1343 Daldinia_concentrica TGCTTCTGACTTCCGTAACGGTCGCTACCTGACGTGCTCAGCCATCTTGT 1342 Hypoxylon_investiens TGCTTCCGACTTCCGTAACGGTCGTTACCTGACTTGCTCTGCCATCTTGT 1372 Theissenia_cinerea TGCTTCGGATTTCCGCAACGGTCGATATCTGACGTGTTCTGCCATCTTGT 1401 Mn_majus027 ------Mn_majus049 ------Xylaria_escharoidea CGGCTCCGACTTCCGCAACGGCCGATACTTGACTTGCTCAGCCATCTT-- 1542

Entoleuca_mammta AAGA-----ATCTAACCCT--TCCCCC--TCG---ACACCA---TCCG-- 1482 Rosellinia_merrillii AAGT-----GTTTAGTCCC--TTTCCC--GTA---AACTTG---TTAG-- 1504 Nemania_macrocarpa AAGG-----ACCTATTTCC--TTTCCCCTTCG---TTATGC---TCGG-- 1471 Nemania_illita AAGA-----ACCAACT-CT--TTCCCGCTGTG---ACATGA---CTAA-- 1437 Amphirosellinia_fushanensis AAGA-----AACCACCACCTTTCCCCCTTCCA---GCGCTA---CAAA-- 1475 Astrocystis_bambusae AAGA-----TAT-ATCG----TTCTCTATATG---GCATGT---TATG-- 1464 Kretzschmaria_guyanensis AAGA-----CTTTTCTCGCGTGCCCTTCCA------AGTAA---CCAA-- 1462 Kretzschmaria_lucidula AAGG-----ACTCACTCTTGTACCTTTCTACA---CAGTCAA-TTCAA-- 1454 Penzigia_cantareirensis AAGA-----ACTTTCTCTTTTACCTTAGGACC---TGGT------CAA-- 1451 Stilbohypoxylon_elaeicola AAGA-----GTCCCCCCC----CCCTGCAAGC-----ATGAG--CCCA-- 1527 Discoxylaria_myrmecophila AAGG-----ATTCTTCCCTCATAATCTTGTGG---AAACTA---CGGA-- 1452 Podosordaria_mexicana AAGT-----TTTCACTTTTTTGCTCCCCTTCT---GACTGGC-TTCAGCG 1448 Whalleya_microplaca AAGG-----ATCTCTT-ATACTTTGACCT------CTATTGGTACTATC- 1509 Creosphaeria_sassafras AAG------ATAACAT-CAAGATGATCCT------CTATAG--AATATC- 1492 Annulohypoxylon_cohaerens AAG------ATAAAATACAATACGTTCCTACCGGTCTGTGATACCTAAT- 1386 88

Daldinia_concentrica ATGAT----ATTCCCTTTAAATT--TCATA------CATTG---CTTGT- 1376 Hypoxylon_investiens AAGAT----ATCCAACTAAAATCGTCCAT------TAATTG---CTAAT- 1408 Theissenia_cinerea AAGCCTCGTGTTCACTCTCCTTCCTCTCAAACCCCCTGTGGGCCCGGG-- 1449 Mn_majus027 ------Mn_majus049 ------Xylaria_escharoidea ------

Entoleuca_mammta ------TTACTAA------TCTAACATT--TCAATAGCCGTGGTAAGG 1516 Rosellinia_merrillii ------TTGCTAA------TCCGACTCG--TCAATAGCCGTGGTAAGG 1538 Nemania_macrocarpa ------TTACTAA------TTTTGCA----TGACCAGCCGTGGCAAGG 1503 Nemania_illita ------TTGCTAA------CCTTGCACTC-TAAATAGCCGTGGCAAGG 1472 Amphirosellinia_fushanensis ------TCGCTAA------CTGAATATTC-TCGCTAGCCGTGGTAAGG 1510 Astrocystis_bambusae ------TTGCTAA------CC--ATGAAC-TCTGTAGCCGTGGCAAGG 1497 Kretzschmaria_guyanensis ------TTGCTAA------CGTGATACGTTTCACCAGCCGTGGCAAGG 1498 Kretzschmaria_lucidula ------TTGCTAAT------CATGATACGCTTT-TTAGCCGTGGCAAGG 1490 Penzigia_cantareirensis ------TTGCTAA------TTTGATACTTTTCATCAGCCGTGGCAAGG 1487 Stilbohypoxylon_elaeicola ------TTACTAA------CTTGA--CGCTCTACTAGCCGTGGCAAGG 1561 Discoxylaria_myrmecophila ------TTGCTAACC-----TCCTACCCCCCTTA-TAGCCGTGGCAAGG 1489 Podosordaria_mexicana AGCGGCGTTGCTAACTGCGGACTTTTTACCTCTTACCAGCCGTGGCAAGG 1498 Whalleya_microplaca ------TTGCTGA------CTTCATACT--C---TAGCCGTGGTAAGG 1540 Creosphaeria_sassafras ------GTGCTAA------TTTGAATCT--TG--TAGCCG-GGCAAGA 1523 Annulohypoxylon_cohaerens ------TTGCTAA------CCCGAAACT--TTTCTAGCCGTGGCAAGA 1420 Daldinia_concentrica ------TTGCTAA------CTTGAATTC--TC--TAGCCGTGGCAAGG 1408 Hypoxylon_investiens ------TTGCTAA------CTTGGTTTG--TC--TAGCCGTGGGAAGG 1440 Theissenia_cinerea ------TGCTAA------CCTCAATCA----ACTAGCCGTGGCAAGG 1480 Mn_majus027 ------Mn_majus049 ------Xylaria_escharoidea ------CCGAGGCAAGA 1553

Entoleuca_mammta TCTCTATGAAGGAAGTTGAGGACCAGATGCGAAATGTCCAGAACAAGAAC 1566 Rosellinia_merrillii TCTCCATGAAGGAGGTTGAAGACCAGATGCGAAATGTCCAGAACAAGAAC 1588 Nemania_macrocarpa TTTCCATGAAGGAGGTTGAGGACCAGATGCGAAACGTCCAGAACAAGAAC 1553 Nemania_illita TTTCCATGAAGGAGGTTGAGGACCAGATGCGAAATGTCCAGAACAAGAAC 1522 Amphirosellinia_fushanensis TTTCCATGAAGGAAGTTGAGGACCAGATGCGAAACGTCCAGAACAAGAAC 1560 Astrocystis_bambusae TTTCCATGAAGGAAGTCGAGGACCAGATGCGCAACGTCCAGAACAAGAAC 1547 Kretzschmaria_guyanensis TCTCCATGAAGGAGGTTGAGGACCAGATGCGAAATGTTCAAAACAAAAAC 1548 Kretzschmaria_lucidula TCTCGATGAAGGAGGTTGAGGACCAGATGCGCAATGTTCAAAACAAGAAC 1540 Penzigia_cantareirensis TCTCTATGAAGGAGGTTGAGGACCAGATGCGAAATGTCCAGAACAAGAAC 1537 Stilbohypoxylon_elaeicola TTTCCATGAAGGAGGTCGAGGATCAGATGAGAAATGTTCAGAACAAGAAC 1611 Discoxylaria_myrmecophila TGTCCATGAAGGAAGTTGAGGACCAGATGCGGAACGTCCAGAACAAGAAC 1539 Podosordaria_mexicana TTTCTATGAAGGAAGTTGAGGACCAGATGCGCAACGTCCAGAACAAGAAC 1548 Whalleya_microplaca TTTCGATGAAGGAAGTCGAAGACCAGATGCGCAATGTCCAGAACAAGAAC 1590 Creosphaeria_sassafras TCTCCATGATTGAGGTTGAGGACCATA---ACAA----CAAGCCAAGGGT 1566 Annulohypoxylon_cohaerens TCTCCATGAAGGAGGTCGAGGACCAGATGCGCAACGTCCAGAACAAGAAC 1470 Daldinia_concentrica TCTCAATGAAGGAAGTTGAAGACCAGATGCGCAACGTC------1446 Hypoxylon_investiens TCTCCATGAAGGAGGTTGAAGACCAGATGCGCAATGTCCAGAACAAGAAC 1490 Theissenia_cinerea TCTCGATGAAGGAGGTCGAGGACCAGATGCGCAACGTTCAGAACAAGAAC 1530 Mn_majus027 ------Mn_majus049 ------Xylaria_escharoidea TCGCCATGAAGGAGGTTGAGGACCAGATGCGCAACGTCCAGAACCGCAAC 1603

Entoleuca_mammta            TCGTCGTACTTCGTTGAGTGGATTCCCAACAACATCCAGA-- 1606
Rosellinia_merrillii        TCGTCGTACTTCGTTGAGTGGATTCCCAACAACATCCAGA-- 1628
Nemania_macrocarpa          TCGTCTTACTTCGTTGAGTGGATTCCCAACAACATCCAGA-- 1593
Nemania_illita              TCGTCTTACTTCGTCGAGTGGATTCCCAACAACATCCAGA-- 1562
Amphirosellinia_fushanensis TCGTCGTACTTTGTCGAATGGATTCCCAACAACATCCAGA-- 1600
Astrocystis_bambusae        TCTTCATACTTCGTCGAGTGGATTCCCAACAACATCCAGA-- 1587
Kretzschmaria_guyanensis    TCATCTTACTTCGTCGAGTGGATTCCCAACAACATCCAGA-- 1588
Kretzschmaria_lucidula      TCAGCGTACTTCGTTGAGTGGATTCCCAACAACATCCAGA-- 1580
Penzigia_cantareirensis     TCATCATACTTTGTTGAGTGGATTCCCAACAACATCCAGA-- 1577
Stilbohypoxylon_elaeicola   TCTTCGTACTTCGTTGAGTGGATTCCCAACAACATCCAGA-- 1651
Discoxylaria_myrmecophila   TCGACCTACTTCGTTGAGTGGATTCCCAACAACATCCAGA-- 1579
Podosordaria_mexicana       TCGTCATACTTCGTTGAGTGGATTCCCAACAACATCCAGA-- 1588
Whalleya_microplaca         TCGACCTACTTCGTTGAGTGGATTCCCAACAACATCCAGA-- 1630
Creosphaeria_sassafras      CACAGCTAC------CCAA----GTTCTG--- 1585
Annulohypoxylon_cohaerens   TCGTCTTACTTCG------TCGAGTGGATTCC---- 1496
Daldinia_concentrica        ------
Hypoxylon_investiens        TCGTCCTACTTCG------TTGAGTGGATTCCCAAA 1520
Theissenia_cinerea          TCGTCCTACTTCGTCGAATGGATTCCCAACAACATCCAGA-- 1570
Mn_majus027                 ------
Mn_majus049                 ------
Xylaria_escharoidea         TCCTCGTACTTCGTTGAATGGATTCCCAACAACATCCAGA-- 1643


Appendix 2.3 RPB2 alignment of M. nivale and M. majus sequences with HindIII digest sites indicated by shading.
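The shading that marked the digest sites does not survive in plain text, so the sites can instead be located by searching each sequence for the enzyme's recognition sequence. The following is a minimal sketch in Python, not part of the original analysis. It assumes the full HindIII recognition sequence AAGCTT; the function name `find_sites` and the `fragments` dictionary are illustrative, and the three fragments are copied from the alignment block in this appendix whose rows end at position 119. Reported positions are 0-based offsets within each fragment, not alignment coordinates.

```python
import re

# HindIII recognition sequence (the enzyme cuts between the two leading A's).
HINDIII_SITE = "AAGCTT"


def find_sites(seq, site=HINDIII_SITE):
    """Return 0-based start positions of `site` in `seq`, ignoring alignment gaps."""
    ungapped = seq.replace("-", "").upper()
    # A lookahead is used so that overlapping matches would also be reported.
    return [m.start() for m in re.finditer("(?=" + re.escape(site) + ")", ungapped)]


# Illustrative fragments: one North American M. majus, one European M. majus,
# and one M. nivale isolate, taken from the alignment in this appendix.
fragments = {
    "99064M_W_NA": "AGATGGTAAACTTGCCAAACCTCGCCAATTGCACAACACACATTGGGGTCTAGTCTGTCC",
    "10149M_W_EU": "AGATGGTCAAATTGCCATACCTCGCCTATTGCACAACACACATTGGGGTCTAGTCTGTCC",
    "96103N_T_NA": "AGATGGTAAGCTTGCCAAACCTCGCCAATTGCACAACACGCATTGGGGTCTTGTCTGTCC",
}

for name, seq in fragments.items():
    print(name, find_sites(seq))
```

In these three fragments only the M. nivale isolate contains the site, which is the kind of difference a HindIII digest of the RPB2 amplicon would reveal.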

99064M_W_NA -CGTTACACATTTGCGTCTACACTGTCCCATTTGCGCAGAACAAACACTCCCGTCGGCAG 59 99049M_W_NA -CGTTACACATTTGCGTCTACACTGTCCCATTTGCGCAGAACAAACACTCCCGTCGGCAG 59 10149M_W_EU -CGTTACACATTTGCGTCTACACTGTCTCATTTGCGCAGAACAAACACTCCCGTCGGCAG 59 10099M_W_EU -CGTTACACATTTGCGTCTACACTGTCTCATTTGCGCAGAACAAACACTCCCGTCGGCAG 59 10098M_W_EU -CGTTACACATTTGCGTCTACACTGTCTCATTTGCGCAGAACAAACACTCCCGTCGGCAG 59 99061M_W_NA -CGTTACACATTTGCGTCTACACTGTCCCATTTGCGCAGAACAAACACTCCCGTCGGCAG 59 96103N_T_NA -CGTTACACTTTTGCGTCTACACTGTCCCATTTGCGCAGAACAAATACTCCTGTCGGTAG 59 96101N_T_NA -CGTTACAGTTTTGCGTCTACACTGTCCCATTTGCGCAGAACAAATACTCCTGTCGGTAG 59 96107N_T_NA -CGTTACACTTTTGCGTCTACACTGTCCCATTTGCGCAGAACAAATACTCCTGTCGGTAG 59 10083N_T_EU -CGTTACACTTTTGCGTCTACACTGTCCCATTTGCGCAGAACAAATACTCCTGTCGGTAG 59 10082N_T_EU -CGTTACACTTTTGCGTCTACACTGTCCCATTTGCGCAGAACAAATACTCCTGTCGGTAG 59 10101N_T_EU -CGTTACACTTTTGCGTCTACACTGTCCCATTTGCGCAGAACAAATACTCCTGTCGGTAG 59 99069N_W_NA -CGTTACACATTTGCGTCTACACTGTCCCATTTGCGCAGAACAAATACTCCTGTCGGTAG 59 99006N_W_NA -CGTTACACATTTGCGTCTACACTGTCCCATTTGCGCAGAACAAATACTCCTGTCGGTAG 59 99084N_W_NA -CGTTACACATTTGCGTCTACACTGTCCCATTTGCGCAGAACAAATACTCCTGTCGGTAG 59 10107N_W_EU -CGTTACACATTTGCGTCTACACTGTCCCATTTGCGCAGAACAAATACTCCTGTCGGTAG 59 10106N_W_EU -CGTTACACATTTGCGTCTACACTGTCCCATTTGCGCAGAACAAATACTCCTGTCGGTAG 59 10152N_W_EU -CGTTACACATTTGCGTCTACACTGTCCCATTTGCGCAGAACAAATACTCCTGTCGGTAG 59 07019MB TCGG-ACACCTTTGCGTCTACCTTGTC-CACTTGCGCAGAACAAACACTCCCGTCGGCCG 58 ** *** *********** **** ** ************** ***** ***** *

HindIII digest site: AGCTT 99064M_W_NA AGATGGTAAACTTGCCAAACCTCGCCAATTGCACAACACACATTGGGGTCTAGTCTGTCC 119 99049M_W_NA AGATGGTAAACTTGCCAAACCTCGCCAATTGCACAACACACATTGGGGTCTAGTCTGTCC 119 10149M_W_EU AGATGGTCAAATTGCCATACCTCGCCTATTGCACAACACACATTGGGGTCTAGTCTGTCC 119 10099M_W_EU AGATGGTCAAATTGCCATACCTCGCCTATTGCACAACACACATTGGGGTCTAGTCTGTCC 119 10098M_W_EU AGATGGTCAAATTGCCATACCTCGCCTATTGCACAACACACATTGGGGTCTAGTCTGTCC 119 99061M_W_NA AGATGGTAAACTTGCCAAACCTCGCCAATTGCACAACACACATTGGGGTCTAGTCTGTCC 119 96103N_T_NA AGATGGTAAGCTTGCCAAACCTCGCCAATTGCACAACACGCATTGGGGTCTTGTCTGTCC 119 96101N_T_NA AGATGGTAAGCTTGCCAAACCTCGCCAATTGCACAACACGCATTGGGGTCTTGTCTGTCC 119 96107N_T_NA AGATGGTAAGCTTGCCAAACCTCGCCAATTGCACAACACGCATTGGGGTCTTGTCTGTCC 119 10083N_T_EU AGATGGTAAGCTTGCCAAACCTCGCCAATTGCACAACACGCATTGGGGTCTTGTCTGTCC 119 10082N_T_EU AGATGGTAAGCTTGCCAAACCTCGCCAATTGCACAACACGCATTGGGGTCTTGTCTGTCC 119 10101N_T_EU AGATGGTAAGCTTGCCAAACCTCGCCAATTGCACAACACGCATTGGGGTCTTGTCTGTCC 119 99069N_W_NA AGATGGTAAGCTTGCCAAACCTCGCCAATTGCACAACACGCATTGGGGTCTTGTCTGTCC 119 99006N_W_NA AGATGGTAAGCTTGCCAAACCTCGCCAATTGCACAACATGCATTGGGGTCTTGTCTGTCC 119 99084N_W_NA AGATGGTAAGCTTGCCAAACCTCGCCAATTGCACAACACGCATTGGGGTCTTGTCTGTCC 119 10107N_W_EU AGATGGTAAGCTTGCCAAACCTCGCCAATTGCACAACACGCATTGGGGTCTTGTCTGTCC 119 10106N_W_EU AGATGGTAAGCTTGCCAAACCTCGCCAATTGCACAACACGCATTGGGGTCTTGTCTGTCC 119 10152N_W_EU AGATGGTAAGCTTGCCAAACCTCGCCAATTGCACAACACGCATTGGGGTCTTGTCTGTCC 119 07019MB AGATGGCAAGCTTGCCAAACCTCGCCAGCTGCACAACACGCACTGGGGTCTTGTCTGTCC 118 ****** * ****** ******** ********* ** ******** ********

99064M_W_NA TGCAGAGACTCCCGAGGGTCAGGCCTGCGGTCTCGTCAAGAACCTTTCTCTCATGTGCTC 179 99049M_W_NA TGCAGAGACTCCCGAGGGTCAGGCCTGCGGTCTCGTCAAGAACCTTTCTCTCATGTGCTC 179 10149M_W_EU TGCAGAGGCTCCCGAGGATCAGGCCTGCGGGCTCGTCGAGAACCTTTCTCTCATGTGCTC 179 10099M_W_EU TGCAGAGGCTCCCGAGGATCAGGCCTGCGGGCTCGTCGAGAACCTTTCTCTCATGTGCTC 179 10098M_W_EU TGCAGAGGCTCCCGAGGATCAGGCCTGCGGGCTCGTCGAGAACCTTTCTCTCATGTGCTC 179 99061M_W_NA TGCAGAGACTCCCGAGGGTCAGGCCTGCGGTCTCGTCAAGAACCTTTCTCTCATGTGCTC 179 96103N_T_NA TGCGGAGACACCCGAGGGTCAGGCCTGCGGTTTGGTCAAGAACCTTTCTCTTATGTGCTC 179 96101N_T_NA TGCGGAGACACCCGAGGGTCAGGCCTGCGGTTTGGTCAAGAACCTTTCTCTTATGTGCTC 179 96107N_T_NA TGCGGAGACACCCGAGGGTCAGGCCTGCGGTTTGGTCAAGAACCTTTCTCTTATGTGCTC 179 10083N_T_EU TGCGGAGACACCCGAGGGTCAGGCCTGCGGTTTGGTCAAGAACCTTTCTCTTATGTGCTC 179 10082N_T_EU TGCGGAGACACCCGAGGGTCAGGCCTGCGGTTTGGTCAAGAACCTTTCTCTTATGTGCTC 179 10101N_T_EU TGCGGAGACACCCGAGGGTCAGGCCTGCGGTTTGGTCAAGAACCTTTCTCTTATGTGCTC 179 99069N_W_NA TGCGGAGACACCCGAGGGTCAGGCCTGCGGTTTGGTCAAGAACCTTTCTCTTATGTGCTC 179 99006N_W_NA TGCGGAGACACCCGAGGGTCAGGCCTGCGGTTTGGTCAAGAACCTTTCTCTTATGTGCTC 179 99084N_W_NA TGCGGAGACACCCGAGGGTCAGGCCTGCGGTTTGGTCAAGAACCTTTCTCTTATGTGCTC 179 10107N_W_EU TGCGGAGACACCCGAGGGTCAGGCCTGCGGTTTGGTCAAGAACCTTTCTCTTATGTGCTC 179 10106N_W_EU TGCGGAGACACCCGAGGGTCAGGCCTGCGGTTTGGTCAAGAACCTTTCTCTTATGTGCTC 179 10152N_W_EU TGCGGAGACACCCGAGGGTCAGGCCTGCGGTTTGGTCAAGAACCTTTCTCTTATGTGCTC 179 07019MB GGCCGAGACGCCCGAAGGGTGGGCTTGTGGTCTGGGCAAGAACCTTTCTCTCATGTGCTC 178 ** *** * ***** * *** ** ** * * * ************* ********

99064M_W_NA GATAAGCGTGGGAACATCGACCGAGCCCATCATCGATTACATGATCACCAGGAACATGGA 239 99049M_W_NA GATAAGCGTGGGAACATCGACCGAGCCCATCATCGATTACATGATCACCAGGAACATGGA 239 10149M_W_EU GATAAGCGTGGGAATCTCGACCGAGCCCATCATCGATTACATGATCACCAGGGACATGCA 239 10099M_W_EU GATAAGCGTGGGAATCTCGACCGAGCCCATCATCGATTACATGATCACCAGGGACATGCA 239 10098M_W_EU GATAAGCGTGGGAATCTCGACCGAGCCCATCATCGATTACATGATCACCAGGGACATGCA 239 99061M_W_NA GATAAGCGTGGGAACATCGACCGAGCCCATCATCGATTACATGATCACCAGGAACATGGA 239 96103N_T_NA GATCAGCGTGGGCACCTCAACTGAGCCCATCATCGATTATATGATTACCAGGAACATGGA 239 96101N_T_NA GATCAGCGTGGGCACCTCAACTGAGCCCATCATCGATTATATGATTACCAGGAACATGGA 239 96107N_T_NA GATCAGCGTGGGCACCTCAACTGAGCCCATCATCGATTATATGATTACCAGGAACATGGA 239 10083N_T_EU GATCAGCGTGGGCACCTCAACTGAGCCCATCATCGATTATATGATTACCAGGAACATGGA 239 10082N_T_EU GATCAGCGTGGGCACCTCAACTGAGCCCATCATCGATTATATGATTACCAGGAACATGGA 239 10101N_T_EU GATCAGCGTGGGCACCTCAACTGAGCCCATCATCGATTATATGATTACCAGGAACATGGA 239 99069N_W_NA GATCAGCGTGGGCACCTCAACTGAGCCCATCATCGATTATATGATTACCAGGAACATGGA 239 99006N_W_NA GATCAGCGTGGGCACCTCAACTGAGCCCATCATCGATTATATGATTACCAGGAACATGGA 239 99084N_W_NA GATCAGCGTGGGCACCTCAACTGAGCCCATCATCGATTATATGATTACCAGGAACATGGA 239 10107N_W_EU GATCAGCGTGGGCACCTCAACTGAGCCCATCATCGATTATATGATTACCAGGAACATGGA 239 10106N_W_EU GATCAGCGTGGGCACCTCAACTGAGCCCATCATCGATTATATGATTACCAGGAACATGGA 239 10152N_W_EU GATCAGCGTGGGCACCTCAACTGAGCCCATCATCGATTATATGATTACCAGGAACATGGA 239 07019MB AATCAGCGTGGGAACCTCAACGGAACCCATTATCGACTACATGATCACGAGGAACATGGA 238 ** ******** * ** ** ** ***** ***** ** ***** ** *** ***** *

99064M_W_NA GGTGTTGGAGGAGTATGAGCCACTGCGATATCCCAATGCCACCAAGATTTTCCTTAACGG 299 99049M_W_NA GGTGTTGGAGGAGTATGAGCCACTGCGATATCCCAATGCCACCAAGATTTTCCTTAACGG 299 10149M_W_EU GGTGTTGGAGGAGAATGAGCCACTGCGATATCCCAATGCCACCAAGATTTTCCTTAACGG 299 10099M_W_EU GGTGTTGGAGGAGAATGAGCCACTGCGATATCCCAATGCCACCAAGATTTTCCTTAACGG 299 10098M_W_EU GGTGTTGGAGGAGAATGAGCCACTGCGATATCCCAATGCCACCAAGATTTTCCTTAACGG 299 99061M_W_NA GGTGTTGGAGGAGAATGAGCCACTGCGATATCCCAATGCCACCAAGATTTTCCTTAACGG 299 96103N_T_NA GGTGTTGGAGGAGTATGAACCATTGCGATATCCCAATGCCACGAAGATTTTCCTCAACGG 299 96101N_T_NA GGTGTTGGAGGAGTATGAACCATTGCGATATCCCAATGCCACGAAGATTTTCCTCAACGG 299 96107N_T_NA GGTGTTGGAGGAGTATGAACCATTGCGATATCCCAATGCCACGAAGATTTTCCTCAACGG 299 10083N_T_EU GGTGTTGGAGGAGTATGAACCATTGCGATATCCCAATGCCACGAAGATTTTCCTCAACGG 299 10082N_T_EU GGTGTTGGAGGAGTATGAACCATTGCGATATCCCAATGCCACGAAGATTTTCCTCAACGG 299 10101N_T_EU GGTGTTGGAGGAGTATGAACCATTGCGATATCCCAATGCCACGAAGATTTTCCTCAACGG 299 99069N_W_NA GGTGTTGGAGGAGTATGAACCATTGCGATATCCCAATGCCACGAAGATTTTCCTCAACGG 299 99006N_W_NA GGTGTTGGAGGAGTATGAACCATTGCGATATCCCAATGCCACGAAGATTTTCCTCAACGG 299 99084N_W_NA GGTGTTGGAGGAGTATGAACCATTGCGATATCCCAATGCCACGAAGATTTTCCTCAACGG 299 10107N_W_EU GGTGTTGGAGGAGTATGAACCATTGCGATATCCCAATGCCACGAAGATTTTCCTCAACGG 299 10106N_W_EU GGTGTTGGAGGAGTATGAACCATTGCGATATCCCAATGCCACGAAGATTTTCCTCAACGG 299 10152N_W_EU GGTGTTGGAGGAGTATGAACCATTGCGATATCCCAATGCCACGAAGATTTTCCTCAACGG 299 07019MB GGTGCTTGAGGAGTACGAGCCACTACGATACCCGAACGCAACAAAGATCTTCCTGAACGG 298 **** * ****** * ** *** * ***** ** ** ** ** ***** ***** *****

HindIII digest site: AGCTT 99064M_W_NA ATCTTGGATTGGTGTGCACCAGGACCCCAAGACGTTGGTGCGGGACGTCCAGCAGCTTCG 359 99049M_W_NA ATCTTGGATTGGTGTGCACCAGGACCCCAAGACGTTGGTGCGGGACGTCCAGCAGCTTCG 359 10149M_W_EU ATCTTGGATTGGTGTGCACCAGGACCCCAAGACGTTGGTGCGGGACGTCCAGCAGCTTCG 359 10099M_W_EU ATCTTGGATTGGTGTGCACCAGGACCCCAAGACGTTGGTGCGGGACGTCCAGCAGCTTCG 359 10098M_W_EU ATCTTGGATTGGTGTGCACCAGGACCCCAAGACGTTGGTGCGGGACGTCCAGCAGCTTCG 359 99061M_W_NA ATCTTGGATTGGTGTGCACCAGGACCCCAAGACGTTGGTGCGGGACGTCCAGCAGCTTCG 359 96103N_T_NA CTCTTGGATCGGTGTGCACCAGGACCCCAAGACGTTGGTGCGAGACGTCCAGCAGCTTCG 359 96101N_T_NA CTCTTGGATCGGTGTGCACCAGGACCCCAAGACGTTGGTGCGAGACGTCCAGCAGCTTCG 359 96107N_T_NA CTCTTGGATCGGTGTGCACCAGGACCCCAAGACGTTGGTGCGAGACGTCCAGCAGCTTCG 359 10083N_T_EU CTCTTGGATCGGTGTGCACCAGGACCCCAAGACGTTGGTGCGAGACGTCCAGCAGCTTCG 359 10082N_T_EU CTCTTGGATCGGTGTGCACCAGGACCCCAAGACGTTGGTGCGAGACGTCCAGCAGCTTCG 359 10101N_T_EU CTCTTGGATCGGTGTGCACCAGGACCCCAAGACGTTGGTGCGAGACGTCCAGCAGCTTCG 359 99069N_W_NA CTCTTGGATCGGTGTGCACCAGGACCCCAAGACGTTGGTGCGAGACGTCCAGCAGCTTCG 359 99006N_W_NA CTCTTGGATCGGTGTGCACCAGGACCCCAAGACGTTGGTGCGAGACGTCCAGCAGCTTCG 359 99084N_W_NA CTCTTGGATCGGTGTGCACCAGGACCCCAAGACGTTGGTGCGAGACGTCCAGCAGCTTCG 359 10107N_W_EU CTCTTGGATCGGTGTGCACCAGGACCCCAAGACGTTGGTGCGAGACGTCCAGCAGCTTCG 359 10106N_W_EU CTCTTGGATCGGTGTGCACCAGGACCCCAAGACGTTGGTGCGAGACGTCCAGCAGCTTCG 359 10152N_W_EU CTCTTGGATCGGTGTGCACCAGGACCCCAAGACGTTGGTGCGAGACGTCCAGCAGCTTCG 359 07019MB ATCGTGGATTGGCGTGCACCAGGATCCCAAGACGTTGGTGCGCGACGTCCAGCAACTTCG 358 ** ***** ** *********** ***************** *********** *****

99064M_W_NA TCGCAACAACCAAATCCCTGCAGAGGTCTCTCTGATTCGTGATATCAGAGACCGTGAATT 419 99049M_W_NA TCGCAACAACCAAATCCCTGCAGAGGTCTCTCTGATTCGTGATATCAGAGACCGTGAATT 419 10149M_W_EU TCGCAACAACCAAATCCCTGCAGAGGTCTCTCTGATTCGTGATATCAGAGACCGTGAATT 419 10099M_W_EU TCGCAACAACCAAATCCCTGCAGAGGTCTCTCTGATTCGTGATATCAGAGACCGTGAATT 419

10098M_W_EU TCGCAACAACCAAATCCCTGCAGAGGTCTCTCTGATTCGTGATATCAGAGACCGTGAATT 419 99061M_W_NA TCGCAACAACCAAATCCCTGCAGAGGTCTCTCTGATTCGTGATATCAGAGACCGTGAATT 419 96103N_T_NA TCGCAACAACCAAATCCCTGCAGAGGTTTCTCTGATTCGTGATATCAGAGACCGCGAGTT 419 96101N_T_NA TCGCAACAACCAAATCCCTGCAGAGGTTTCTCTGATTCGTGATATCAGAGACCGCGAGTT 419 96107N_T_NA TCGCAACAACCAAATCCCTGCAGAGGTTTCTCTGATTCGTGATATCAGAGACCGCGAGTT 419 10083N_T_EU TCGCAACAACCAAATCCCTGCAGAGGTTTCTCTGATTCGTGATATCAGAGACCGCGAGTT 419 10082N_T_EU TCGCAACAACCAAATCCCTGCAGAGGTTTCTCTGATTCGTGATATCAGAGACCGCGAGTT 419 10101N_T_EU TCGCAACAACCAAATCCCTGCAGAGGTTTCTCTGATTCGTGATATCAGAGACCGCGAGTT 419 99069N_W_NA TCGCAACAACCAAATCCCTGCAGAGGTTTCTCTGATTCGTGATATCAGAGACCGCGAGTT 419 99006N_W_NA TCGCAACAACCAAATCCCTGCAGAGGTTTCTCTGATTCGTGATATCAGAGACCGCGAGTT 419 99084N_W_NA TCGCAACAACCAAATCCCTGCAGAGGTTTCTCTGATTCGTGATATCAGAGACCGCGAGTT 419 10107N_W_EU TCGCAACAACCAAATCCCTGCAGAGGTTTCTCTGATTCGTGATATCAGAGACCGCGAGTT 419 10106N_W_EU TCGCAACAACCAAATCCCTGCAGAGGTTTCTCTGATTCGTGATATCAGAGACCGCGAGTT 419 10152N_W_EU TCGCAACAACCAAATCCCTGCAGAGGTTTCTCTGATTCGTGATATCAGAGACCGCGAGTT 419 07019MB TCGCAACAATCAGATTCCTGCAGAGGTTTCGCTCATTCGTGACATCAGAGACCGTGAGTT 418 ********* ** ** *********** ** ** ******** *********** ** **

99064M_W_NA CAAGATCTTTTCCGACGCTGGCCGTGTCATGCGCCCACTACTCGTAGTGGAGCAGGAAGC 479 99049M_W_NA CAAGATCTTTTCCGACGCTGGCCGTGTCATGCGCCCACTACTCGTAGTGGAGCAGGAAGC 479 10149M_W_EU CAAGATCTTTTCCGACGCTGGCCGTGTCATGCGCCCACTACTCGTAGTGGAGCAGGAAGC 479 10099M_W_EU CAAGATCTTTTCCGACGCTGGCCGTGTCATGCGCCCACTACTCGTAGTGGAGCAGGAAGC 479 10098M_W_EU CAAGATCTTTTCCGACGCTGGCCGTGTCATGCGCCCACTACTCGTAGTGGAGCAGGAAGC 479 99061M_W_NA CAAGATCTTTTCCGACGCTGGCCGTGTCATGCGCCCACTACTCGTAGTGGAGCAGGAAGC 479 96103N_T_NA CAAGATCTTTTCCGACGCTGGCCGTGTCATGCGTCCACTACTCGTGGTGGAGCAGGAAGC 479 96101N_T_NA CAAGATCTTTTCCGACGCTGGCCGTGTCATGCGTCCACTACTCGTGGTGGAGCAGGAAGC 479 96107N_T_NA CAAGATCTTTTCCGACGCTGGCCGTGTCATGCGTCCACTACTCGTGGTGGAGCAGGAAGC 479 10083N_T_EU CAAGATCTTTTCCGACGCTGGCCGTGTCATGCGTCCACTACTCGTGGTGGAGCAGGAAGC 479 10082N_T_EU CAAGATCTTTTCCGACGCTGGCCGTGTCATGCGTCCACTACTCGTGGTGGAGCAGGAAGC 479 10101N_T_EU CAAGATCTTTTCCGACGCTGGCCGTGTCATGCGTCCACTACTCGTGGTGGAGCAGGAAGC 479 99069N_W_NA CAAGATCTTTTCCGACGCTGGCCGTGTCATGCGTCCACTACTCGTGGTGGAGCAGGAGGC 479 99006N_W_NA CAAGATCTTTTCCGACGCTGGCCGTGTCATGCGTCCACTACTCGTGGTGGAGCAGGAGGC 479 99084N_W_NA CAAGATCTTTTCCGACGCTGGCCGTGTCATGCGTCCACTACTCGTGGTGGAGCAGGAGGC 479 10107N_W_EU CAAGATCTTTTCCGACGCTGGCCGTGTCATGCGTCCACTACTCGTGGTGGAGCAGGAGGC 479 10106N_W_EU CAAGATCTTTTCCGACGCTGGCCGTGTCATGCGTCCACTACTCGTGGTGGAGCAGGAGGC 479 10152N_W_EU CAAGATCTTTTCCGACGCTGGCCGTGTCATGCGTCCACTACTCGTGGTGGAGCAGGAGGC 479 07019MB CAAGATATTTTCTGACGCCGGTCGTGTCATGCGCCCCCTGTTTGTGGTGGAGCAAGAGGA 478 ****** ***** ***** ** *********** ** ** * ** ******** ** *

HindIII digest site: A- 99064M_W_NA CAATCCGGAAACGGGTATAGAAATGGGCTCGTTGACACTCAACAAGGAGCACATCAGGAA 539 99049M_W_NA CAATCCGGAAACGGGTATAGAAATGGGCTCGTTGACACTCAACAAGGAGCACATCAGGAA 539 10149M_W_EU CAATCCGGAAACGGGTATAGAAATGGGCTCGTTGACACTCAACAAGGAGCACATCAGGAA 539 10099M_W_EU CAATCCGGAAACGGGTATAGAAATGGGCTCGTTGACACTCAACAAGGAGCACATCAGGAA 539 10098M_W_EU CAATCCGGAAACGGGTATAGAAATGGGCTCGTTGACACTCAACAAGGAGCACATCAGGAA 539 99061M_W_NA CAATCCGGAAACGGGTATAGAAATGGGCTCGTTGACACTCAACAAGGAGCACATCAGGAA 539 96103N_T_NA CAATCCGGAGACGGGTGTAGAAATGGGCTCGTTAACACTCAACAAGGAGCACATCAGGAA 539 96101N_T_NA CAATCCGGAGACGGGTGTAGAAATGGGCTCGTTAACACTCAACAAGGAGCACATCAGGAA 539 96107N_T_NA CAATCCGGAGACGGGTGTAGAAATGGGCTCGTTAACACTCAACAAGGAGCACATCAGGAA 539 10083N_T_EU CAATCCGGAGACGGGTGTAGAAATGGGCTCGTTAACACTCAACAAGGAGCACATCAGGAA 539 10082N_T_EU CAATCCGGAGACGGGTGTAGAAATGGGCTCGTTAACACTCAACAAGGAGCACATCAGGAA 539 10101N_T_EU CAATCCGGAGACGGGTGTAGAAATGGGCTCGTTAACACTCAACAAGGAGCACATCAGGAA 539 99069N_W_NA CAATCCGGAGACGGGTGTAGAAATGGGCTCTTTAACACTCAACAAGGAGCACATCAGGAA 539 99006N_W_NA CAATCCGGAGACGGGTGTAGAAATGGGCTCTTTAACACTCAACAAGGAGCACATCAGGAA 539 99084N_W_NA CAATCCGGAGACGGGTGTAGAAATGGGCTCTTTAACACTCAACAAGGAGCACATCAGGAA 539 10107N_W_EU CAATCCGGAGACGGGTGTAGAAATGGGCTCTTTAACACTCAACAAGGAGCACATCAGGAA 539 10106N_W_EU CAATCCGGAGACGGGTGTAGAAATGGGCTCTTTAACACTCAACAAGGAGCACATCAGGAA 539 10152N_W_EU CAATCCGGAGACGGGTGTAGAAATGGGCTCTTTAACACTCAACAAGGAGCACATCAGGAA 539 07019MB CAACCCAGACACTGGGGTGGAGAAAGGTTCCTTGGTGCTCAATAAGGAGCATATCAGGAA 538 *** ** ** ** ** * ** * ** ** ** ***** ******** ******** HindIII digest: -GCTT 99064M_W_NA GCTTGAGAACGATCAAGGCTTGCCCT-CTGGCAGTGACGAGTACTTCGGCTGGCAAGGTT 598 99049M_W_NA GCTTGAGAACGATCAAGGCTTGCCCT-CTGGCAGTGACGAGTACTTCGGCTGGCAAGGTT 598 10149M_W_EU GCTTGAGAACGATCAAGGCTTGCCCT-CTGGCAGTGACGAGTACTTCGGCTGGCAAGGTT 598 10099M_W_EU GCTTGAGAACGATCAAGGCTTGCCCT-CTGGCAGTGACGAGTACTTCGGCTGGCAAGGTT 598 10098M_W_EU GCTTGAGAACGATCAAGGCTTGCCCT-CTGGCAGTGACGAGTACTTCGGCTGGCAAGGTT 598 99061M_W_NA 
GCTTGAGAACGATCAAGGCTTGCCCT-CTGGCAGTGACGAGTACTTCGGCTGGCAAGGTT 598 96103N_T_NA GCTTGAGAACGATCAAGGCCTACCCT-CTGGTAGTGACGAGTACTTCGGCTGGCAAGGTT 598 96101N_T_NA GCTTGAGAACGATCAAGGCCTACCCT-CTGGTAGTGACGAGTACTTCGGCTGGCAAGGTT 598 96107N_T_NA GCTTGAGAACGATCAAGGCCTACCCT-CTGGTAGTGACGAGTACTTCGGCTGGCAAGGTT 598

10083N_T_EU GCTTGAGAACGATCAAGGCCTACCCT-CTGGTAGTGACGAGTACTTCGGCTGGCAAGGTT 598 10082N_T_EU GCTTGAGAACGATCAAGGCCTACCCT-CTGGTAGTGACGAGTACTTCGGCTGGCAAGGTT 598 10101N_T_EU GCTTGAGAACGATCAAGGCCTACCCT-CTGGTAGTGACGAGTACTTCGGCTGGCAAGGTT 598 99069N_W_NA GCTTGAGAACGATCAAGGCCTACCCT-CTGGTAGTGACGAGTACTTCGGCTGGCAAGGTT 598 99006N_W_NA GCTTGAGAACGATCAAGGCCTACCCT-CTGGTAGTGACGAGTACTTCGGCTGGCAAGGTT 598 99084N_W_NA GCTTGAGAACGATCAAGGCCTACCCT-CTGGTAGTGACGAGTACTTCGGCTGGCAAGGTT 598 10107N_W_EU GCTTGAGAACGATCAAGGCCTACCCT-CTGGTAGTGACGAGTACTTCGGCTGGCAAGGTT 598 10106N_W_EU GCTTGAGAACGATCAAGGCCTACCCT-CTGGTAGTGACGAGTACTTCGGCTGGCAAGGTT 598 10152N_W_EU GCTTGAGAACGATCAAGGCCTACCCT-CTGGTAGTGACGAGTACTTCGGCTGGCAAGGTT 598 07019MB GCTCGAGAACGACCA-GGCTCACGGGGCTGGGAGCGAGGAGTACTTTGGCTGGCAAG--- 594 *** ******** ** *** * **** ** ** ******** **********

99064M_W_NA TGGTCAACGAGGGTGTCATCGAGTATCTTGATGCTGAGGAAGAGGAAACCTCAATGATTT 658 99049M_W_NA TGGGCAACGAGGGTGTCATCGAGTATCTTGATGCTGAGGAAGAAGAAACCTCAATGATTT 658 10149M_W_EU TGGTCAACGAGGGTGTCATCGAGTATCTTGATGCTGAGGAAGAGGAAACCTCAATGATTT 658 10099M_W_EU TGGTCAACGAGGGTGTCATCGAGTATCTTGATGCTGAGGAAGAGGAAACCTCAATGATTT 658 10098M_W_EU TGGTCAACGAGGGTGTCATCGAGTATCTTGATGCTGAGGAAGAGGAAACCTCAATGATTT 658 99061M_W_NA TGGTCAACGAGGGTGTCATCGAGTATCTTGATGCTGAGGAAGAGGAAACCTCAATGATTT 658 96103N_T_NA TGGTCAACGAGGGTGTCATCGAGTATCTCGACGCCGAGGAAGAGGAAACTTCAATGATTT 658 96101N_T_NA TGGTCAACGAGGGTGTCATCGAGTATCTCGACGCCGAGGAAGAGGAAACTTCAATGATTT 658 96107N_T_NA TGGTCAACGAGGGTGTCATCGAGTATCTCGACGCCGAGGAAGAGGAAACTTCAATGATTT 658 10083N_T_EU TGGTCAACGAGGGTGTCATCGAGTATCTCGACGCCGAGGAAGAGGAAACTTCAATGATTT 658 10082N_T_EU TGGTCAACGAGGGTGTCATCGAGTATCTCGACGCCGAGGAAGAGGAAACTTCAATGATTT 658 10101N_T_EU TGGTCAACGAGGGTGTCATCGAGTATCTCGACGCCGAGGAAGAGGAAACTTCAATGATTT 658 99069N_W_NA TGGTCAACGAGGGTGTCATCGAGTATCTCGACGCCGAGGAAGAGGAAACTTCAATGATTT 658 99006N_W_NA TGGTCAACGAGGGTGTCATCGAGTATCTCGACGCCGAGGAAGAGGAAACTTCAATGATTT 658 99084N_W_NA TGGTCAACGAGGGTGTCATCGAGTATCTCGACGCCGAGGAAGAGGAAACTTCAATGATTT 658 10107N_W_EU TGGTCAACGAGGGTGTCATCGAGTATCTCGACGCCGAGGAAGAGGAAACTTCAATGATTT 658 10106N_W_EU TGGTCAACGAGGGTGTCATCGAGTATCTCGACGCCGAGGAAGAGGAAACTTCAATGATTT 658 10152N_W_EU TGGTCAACGAGGGTGTCATCGAGTATCTCGACGCCGAGGAAGAGGAAACTTCAATGATTT 658 07019MB ------

99064M_W_NA GCATGACCGCTGAGGATCTTGAGACCTTTCGTTTGGCGAAACAAGGCCACGACATGACAA 718 99049M_W_NA GCATGACCGCTGAGGATCTTGAGACCTTTCGTTTGGCGAAACAAGGCCCCGACATGACAA 718 10149M_W_EU GCATGACCGCTGAGGATCTTGAGACCTTTCGTTTGGCGAAACAAGGCCACGACATGACAA 718 10099M_W_EU GCATGACCGCTGAGGATCTTGAGACCTTTCGTTTGGCGAAACAAGGCCACGACATGACAA 718 10098M_W_EU GCATGACCGCTGAGGATCTTGAGACCTTTCGTTTGGCGAAACAAGGCCACGACATGACAA 718 99061M_W_NA GCATGACCGCTGAGGATCTTGAGACCTTTCGTTTGGCGAAACAAGGCCACGACATGACAA 718 96103N_T_NA GTATGACTGCTGAGGATCTTGAGACCTTTCGTTTGGCGAAACAAGGTCACGACATGACAA 718 96101N_T_NA GTATGACTGCTGAGGATCTTGAGACCTTTCGTTTGGCGAAACAAGGTCACGACATGACAA 718 96107N_T_NA GTATGACTGCTGAGGATCTTGAGACCTTTCGTTTGGCGAAACAAGGTCACGACATGACAA 718 10083N_T_EU GTATGACTGCTGAGGATCTTGAGACCTTTCGTTTGGCGAAACAAGGTCACGACATGACAA 718 10082N_T_EU GTATGACTGCTGAGGATCTTGAGACCTTTCGTTTGGCGAAACAAGGTCACGACATGACAA 718 10101N_T_EU GTATGACTGCTGAGGATCTTGAGACCTTTCGTTTGGCGAAACAAGGTCACGACATGACAA 718 99069N_W_NA GTATGACCGCTGAGGATCTTGAGACCTTTCGTTTGGCGAAACAAGGTCACGACATGACAA 718 99006N_W_NA GTATGACCGCTGAGGATCTTGAGACCTTTCGTTTGGCGAAACAAGGTCACGACATGACAA 718 99084N_W_NA GTATGACCGCTGAGGATCTTGAGACCTTTCGTTTGGCGAAACAAGGTCACGACATGACAA 718 10107N_W_EU GTATGACCGCTGAGGATCTTGAGACCTTTCGTTTGGCGAAACAAGGTCACGACATGACAA 718 10106N_W_EU GTATGACCGCTGAGGATCTTGAGACCTTTCGTTTGGCGAAACAAGGTCACGACATGACAA 718 10152N_W_EU GTATGACCGCTGAGGATCTTGAGACCTTTCGTTTGGCGAAACAAGGTCACGACATGACAA 718 07019MB ------

99064M_W_NA CGGACAACAGC 729 99049M_W_NA CGGAAAACAGC 729 10149M_W_EU CGGACAACAGC 729 10099M_W_EU CGGACAACAGC 729 10098M_W_EU CGGACAACAGC 729 99061M_W_NA CGGACAACAGC 729 96103N_T_NA CAGACAACACC 729 96101N_T_NA CAGACAACACC 729 96107N_T_NA CAGACAACACC 729 10083N_T_EU CAGACAACACC 729 10082N_T_EU CAGACAACACC 729 10101N_T_EU CAGACAACACC 729 99069N_W_NA CAGACAACACC 729 99006N_W_NA CAGACAACACC 729 99084N_W_NA CAGACAACACC 729

10107N_W_EU CAGACAACACC 729 10106N_W_EU CAGACAACACC 729 10152N_W_EU CAGACAACACC 729 07019MB ------
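The HindIII annotations in the alignment above mark positions where the recognition sequence is present in some isolates and absent in others. As an illustration only, and not as part of the original analysis, the following minimal Python sketch locates HindIII recognition sites in a sequence. It assumes the standard HindIII recognition sequence AAGCTT, which the enzyme cuts after the first A, leaving the AGCTT shown in the annotations. The function name `find_hindiii_sites` and the two example fragments, copied from the second alignment block, are choices made for this sketch.

```python
# Hypothetical helper for illustration; not taken from the thesis methods.
HINDIII_SITE = "AAGCTT"  # standard HindIII recognition sequence (cut: A^AGCTT)


def find_hindiii_sites(sequence):
    """Return 0-based start positions of every HindIII site in a sequence.

    Alignment gap characters ('-') are removed first, so the positions
    refer to the ungapped sequence.
    """
    ungapped = sequence.replace("-", "").upper()
    positions = []
    start = ungapped.find(HINDIII_SITE)
    while start != -1:
        positions.append(start)
        # Step forward by one base so overlapping matches are not skipped.
        start = ungapped.find(HINDIII_SITE, start + 1)
    return positions


# Fragments copied from the second alignment block above.
majus_fragment = "AGATGGTAAACTTGCCAAACCTCGCC"   # 99064M_W_NA
nivale_fragment = "AGATGGTAAGCTTGCCAAACCTCGCC"  # 96103N_T_NA

print("M. majus fragment:", find_hindiii_sites(majus_fragment))
print("M. nivale fragment:", find_hindiii_sites(nivale_fragment))
```

In this sketch, the M. nivale fragment contains one recognition site and the M. majus fragment contains none, matching the single-base difference (AAGCTT versus AAACTT) visible in the alignment.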


Appendix 2.4 Alignment of β-tubulin sequences from M. nivale and M. majus. Primer-binding sites are indicated by shading.

10099M_W_EU AGGCTTCCGGCAACAAGTACGTTCCTCGTGCCGTCCTTGTCGATCTCGAGCCCGGTACCA 60 10098M_W_EU AGGCTTCCGGCAACAAGTACGTTCCTCGTGCCGTCCTTGTCGATCTCGAGCCCGGTACCA 60 10149M_W_EU AGGCTTCCGGCAACAAGTACGTTCCTCGTGCCGTCCTTGTCGATCTCGAGCCCGGTACCA 60 99027M_W_NA AGGCTTCCGGCAACAAGTACGTTCCTCGTGCCGTCCTTGTCGATCTCGAGCCCGGTACCA 60 99049M_W_NA AGGCTTCCGGCAACAAGTACGTTCCTCGTGCCGTCCTTGTCGATCTCGAGCCCGGTACCA 60 99061M_W_NA AGGCTTCCGGCAACAAGTACGTTCCTCGTGCCGTCCTTGTCGATCTCGAGCCCGGTACCA 60 99084N_W_NA AGGCTTCCGGCAACAAGTACGTTCCCCGCGCCGTCCTCGTCGATCTCGAGCCCGGTACCA 60 10107N_W_EU AGGCTTCCGGCAACAAGTACGTTCCCCGCGCCGTCCTCGTCGATCTCGAGCCCGGTACCA 60 10106N_W_EU AGGCTTCCGGCAACAAGTACGTTCCCCGCGCCGTCCTCGTCGATCTCGAGCCCGGTACCA 60 99006N_W_NA AGGCTTCCGGCAACAAGTACGTTCCCCGCGCCGTCCTCGTCGATCTCGAGCCCGGTACCA 60 99069N_W_NA AGGCTTCCGGCAACAAGTACGTTCCCCGCGCCGTCCTCGTCGATCTCGAGCCCGGTACCA 60 10152N_W_EU AGGCTTCCGGCAACAAGAACGTTCCCCGCGCCGTCCTCGTCGATCTCGAGCCCGGTACCA 60 96107N_T_NA AGGCTTCCGGCAACAAGTACGTTCCCCGCGCCGTCCTCGTCGATCTCGAGCCCGGTACCA 60 96103N_T_NA AGGCTTCCGGCAACAAGTACGTTCCCCGCGCCGTCCTCGTCGATCTCGAGCCCGGTACCA 60 96101N_T_NA AGGCTTCCGGCAACAAGTACGTTCCCCGCGCCGTCCTCGTCGATCTCGAGCCCGGTACCA 60 10101N_T_EU AGGCTTCCGGCAACAAGTACGTTCCCCGCGCCGTCCTCGTCGATCTCGAGCCCGGTACCA 60 10082N_T_EU AGGCTTCCGGCAACAAGTACGTTCCCCGCGCCGTCCTCGTCGATCTCGAGCCCGGTACCA 60 10083N_T_EU AGGCTTCCGGCAACAAGTACGTTCCCCGCGCCGTCCTCGTCGATCTCGAGCCCGGTACCA 60 07019_OUT AGGCCTCCGGAAACAAGTACGTTCCTCGCGCTGTTCTCGTCGATCTTGAGCCCGGTACCA 60 07020_OUT AGGCCTCCGGCAACAAGTACGTTCCTCGCGCTGTTCTCGTCGATCTTGAGCCCGGTACCA 60 **** ***** ****** ******* ** ** ** ** ******** *************

10099M_W_EU TGGATGCCGTCCGTGCTGGTCCCTTCGGCCAGCTGTTCCGCCCCGACAACTTCGTCTTCG 120 10098M_W_EU TGGATGCCGTCCGTGCTGGTCCCTTCGGCCAGCTGTTCCGCCCCGACAACTTCGTCTTCG 120 10149M_W_EU TGGATGCCGTCCGTGCTGGTCCCTTCGGCCAGCTGTTCCGCCCCGACAACTTCGTCTTCG 120 99027M_W_NA TGGATGCCGTCCGTGCTGGTCCCTTCGGCCAGCTGTTCCGCCCCGACAACTTCGTCTTCG 120 99049M_W_NA TGGATGCCGTCCGTGCTGGTCCCTTCGGCCAGCTGTTCCGCCCCGACAACTTCGTCTTCG 120 99061M_W_NA TGGATGCCGTCCGTGCTGGTCCCTTCGGCCAGCTGTTCCGCCCCGACAACTTCGTCTTCG 120 99084N_W_NA TGGATGCCGTCCGTGCTGGTCCCTTCGGCCAGCTGTTCCGTCCCGACAACTTCGTCTTCG 120 10107N_W_EU TGGATGCCGTCCGTGCTGGTCCCTTCGGCCAGCTGTTCCGTCCCGACAACTTCGTCTTCG 120 10106N_W_EU TGGATGCCGTCCGTGCTGGTCCCTTCGGCCAGCTGTTCCGTCCCGACAACTTCGTCTTCG 120 99006N_W_NA TGGATGCCGTCCGTGCTGGTCCCTTCGGCCAGCTGTTCCGTCCCGACAACTTCGTCTTCG 120 99069N_W_NA TGGATGCCGTCCGTGCTGGTCCCTTCGGCCAGCTGTTCCGTCCCGACAACTTCGTCTTCG 120 10152N_W_EU TGGATGCCGTCCGTGCTGGTCCCTTCGGCCAGCTGTTCCGTCCCGACAACTTCGTCTTCG 120 96107N_T_NA TGGATGCCGTCCGTGCTGGTCCCTTCGGCCAGCTGTTCCGTCCCGACAACTTCGTCTTCG 120 96103N_T_NA TGGATGCCGTCCGTGCTGGTCCCTTCGGCCAGCTGTTCCGTCCCGACAACTTCGTCTTCG 120 96101N_T_NA TGGATGCCGTCCGTGCTGGTCCCTTCGGCCAGCTGTTCCGTCCCGACAACTTCGTCTTCG 120 10101N_T_EU TGGATGCCGTCCGTGCTGGTCCCTTCGGCCAGCTGTTCCGTCCCGACAACTTCGTCTTCG 120 10082N_T_EU TGGATGCCGTCCGTGCTGGTCCCTTCGGCCAGCTGTTCCGTCCCGACAACTTCGTCTTCG 120 10083N_T_EU TGGATGCCGTCCGTGCTGGTCCCTTCGGCCAGCTGTTCCGTCCCGACAACTTCGTCTTCG 120 07019_OUT TGGATGCCGTCCGTGCTGGTCCCTTCGGCCAGCTTTTCCGCCCCGACAACTTCGTCTTCG 120 07020_OUT TGGATGCCGTCCGTGCTGGTCCCTTCGGCCAGCTTTTCCGCCCCGACAACTTCGTCTTCG 120 ********************************** ***** *******************

10099M_W_EU GTCAGTCCGGTGCTGGCAACAACTGGGCGAAGGGTCACTACACTGAGGGTGCCGAGCTTG 180 10098M_W_EU GTCAGTCCGGTGCTGGCAACAACTGGGCGAAGGGTCACTACACTGAGGGTGCCGAGCTTG 180 10149M_W_EU GTCAGTCCGGTGCTGGCAACAACTGGGCGAAGGGTCACTACACTGAGGGTGCCGAGCTTG 180 99027M_W_NA GTCAGTCCGGTGCTGGCAACAACTGGGCGAAGGGTCACTACACTGAGGGTGCCGAGCTTG 180 99049M_W_NA GTCAGTCCGGTGCTGGCAACAACTGGGCGAAGGGTCACTACACTGAGGGTGCCGAGCTTG 180 99061M_W_NA GTCAGTCCGGTGCTGGCAACAACTGGGCGAAGGGTCACTACACTGAGGGTGCCGAGCTTG 180 99084N_W_NA GTCAGTCCGGTGCTGGCAACAATTGGGCGAAGGGTCATTACACTGAGGGTGCCGAGCTTG 180 10107N_W_EU GTCAGTCCGGTGCTGGCAACAATTGGGCGAAGGGTCATTACACTGAGGGTGCCGAGCTTG 180 10106N_W_EU GTCAGTCCGGTGCTGGCAACAATTGGGCGAAGGGTCATTACACTGAGGGTGCCGAGCTTG 180 99006N_W_NA GTCAGTCCGGAGCTGGCAACAATTGGACGAAGGGTCATTACACTGAGGGTGCCGAGCTTG 180 99069N_W_NA GTCAGTCCGGTGCTGGCAACAATTGGACGAAGGGTCATTACACTGAGGGTGCCGAGCTTG 180 10152N_W_EU GTCAGTCCGGTGCTGGCAACAATTGGGCGAAGGGTCATTACACTGAGGGTGCCGAGCTTG 180 96107N_T_NA GTCAGTCCGGTGCTGGCAACAACTGGGCGAAGGGTCATTACACTGAGGGTGCCGAGCTTG 180 96103N_T_NA GTCAGTCCGGTGCTGGCAACAACTGGGCGAAGGGTCATTACACTGAGGGTGCCGAGCTTG 180 96101N_T_NA GTCAGTCCGGTGCTGGCAACAACTGGGCGAAGGGTCATTACACTGAGGGTGCCGAGCTTG 180 10101N_T_EU GTCAGTCCGGTGCTGGCAACAACTGGGCGAAGGGTCATTACACTGAGGGTGCCGAGCTTG 180 10082N_T_EU GTCAGTCCGGTGCTGGCAACAACTGGGCGAAGGGTCATTACACTGAGGGTGCCGAGCTTG 180 10083N_T_EU GTCAGTCCGGTGCTGGCAACAACTGGGCGAAGGGTCATTACACTGAGGGTGCCGAGCTTG 180 07019_OUT GTCAGTCCGGTGCCGGCAACAACTGGGCCAAGGGTCATTACACTGAGGGTGCCGAGCTTG 180

07020_OUT GTCAGTCCGGTGCCGGCAACAACTGGGCCAAGGGTCATTACACTGAGGGTGCCGAGCTTG 180 ********** ** ******** *** * ******** **********************

10099M_W_EU TCGACCAGGTCCTCGAGGTCGTCCGTCGCGAGGCTGAGGGCTGCGACTGCCTTCAGGGTT 240 10098M_W_EU TCGACCAGGTCCTCGAGGTCGTCCGTCGCGAGGCTGAGGGCTGCGACTGCCTTCAGGGTT 240 10149M_W_EU TCGACCAGGTCCTCGAGGTCGTCCGTCGCGAGGCTGAGGGCTGCGACTGCCTTCAGGGTT 240 99027M_W_NA TCGACCAGGTCCTCGAGGTCGTCCGTCGCGAGGCTGAGGGCTGCGACTGCCTTCAGGGTT 240 99049M_W_NA TCGACCAGGTCCTCGAGGTCGTCCGTCGCGAGGCTGAGGGCTGCGACTGCCTTCAGGGTT 240 99061M_W_NA TCGACCAGGTCCTCGAGGTCGTCCGTCGCGAGGCTGAGGGCTGCGACTGCCTTCAGGGTT 240 99084N_W_NA TCGACCAGGTCCTCGAGGTCGTCCGTCGCGAGGCTGAGGGCTGTGACTGCCTTCAGGGTT 240 10107N_W_EU TCGACCAGGTCCTCGAGGTCGTCCGTCGCGAGGCTGAGGGCTGTGACTGCCTTCAGGGTT 240 10106N_W_EU TCGACCAGGTCCTCGAGGTCGTCCGTCGCGAGGCTGAGGGCTGTGACTGCCTTCAGGGTT 240 99006N_W_NA TCGACCAGGTCCTCGAGGTCGTCCGTCGCGAGGCTGAGGGCTGTGACTGCCTTCAGGGTT 240 99069N_W_NA TCGACCAGGTCCTCGAGGTCGTCCGTCGCGAGGCTGAGGGCTGTGACTGCCTTCAGGGTT 240 10152N_W_EU TCGACCAGGTCCTCGAGGTCGTCCGTCGCGAGGCTGAGGGCTGTGACTGCCTTCAGGGTT 240 96107N_T_NA TCGACCAGGTCCTCGAAGTCGTCCGTCGCGAGGCTGAGGGCTGTGACTGCCTTCAGGGTT 240 96103N_T_NA TCGACCAGGTCCTCGAAGTCGTCCGTCGCGAGGCTGAGGGCTGTGACTGCCTTCAGGGTT 240 96101N_T_NA TCGACCAGGTCCTCGAAGTCGTCCGTCGCGAGGCTGAGGGCTGTGACTGCCTTCAGGGTT 240 10101N_T_EU TCGACCAGGTCCTCGAAGTCGTCCGTCGCGAGGCTGAGGGCTGTGACTGCCTTCAGGGTT 240 10082N_T_EU TCGACCAGGTCCTCGAAGTCGTCCGTCGCGAGGCTGAGGGCTGTGACTGCCTTCAGGGTT 240 10083N_T_EU TCGACCAGGTCCTCGAGGTCGTCCGTCGCGAGGCTGAGGGCTGTGACTGCCTTCAGGGTT 240 07019_OUT TCGACCAGGTCCTCGAGGTCGTCCGTCGCGAGGCTGAGGGCTGCGACTGCCTCCAGGGTT 240 07020_OUT TCGACCAGGTCCTCGAGGTCGTCCGTCGCGAGGCTGAGGGCTGCGACTGCCTCCAGGGTT 240 **************** ************************** ******** *******

10099M_W_EU TCCAGATCACCCACTCTCTCGGTGGTGGTACCGGTGCCGGTATGGGTACGCTCCTCATCT 300 10098M_W_EU TCCAGATCACCCACTCTCTCGGTGGTGGTACCGGTGCCGGTATGGGTACGCTCCTCATCT 300 10149M_W_EU TCCAGATCACCCACTCTCTCGGTGGTGGTACCGGTGCCGGTATGGGTACGCTCCTCATCT 300 99027M_W_NA TCCAGATCACCCACTCTCTCGGTGGTGGTACCGGTGCCGGTATGGGTACGCTCCTCATCT 300 99049M_W_NA TCCAGATCACCCACTCTCTCGGTGGTGGTACCGGTGCCGGTATGGGTACGCTCCTCATCT 300 99061M_W_NA TCCAGATCACCCACTCTCTCGGTGGTGGTACCGGTGCCGGTATGGGTACGCTCCTCATCT 300 99084N_W_NA TTCAGATCACCCACTCTCTCGGTGGTGGTACCGGTGCCGGAATGGGCACGCTCCTCATCT 300 10107N_W_EU TTCAGATCACCCACTCTCTCGGTGGTGGTACCGGTGCCGGAATGGGCACGCTCCTCATCT 300 10106N_W_EU TTCAGATCACCCACTCTCTCGGTGGTGGTACCGGTGCCGGAATGGGCACGCTCCTCATCT 300 99006N_W_NA TTCAGATCACCCACTCTCTCGGTGGTGGTACCGGTGCCGGAATGGGCACGCTCCTCATCT 300 99069N_W_NA TTCAGATCACCCACTCTCTCGGTGGTGGTACCGGTGCCGGAATGGGCACGCTCCTCATCT 300 10152N_W_EU TTCAGATCACCCACTCTCTCGGTGGTGGTACCGGTGCCGGAATGGGCACGCTCCTCATCT 300 96107N_T_NA TCCAGATCACCCACTCTCTCGGTGGTGGTACTGGTGCCGGAATGGGCACGCTCCTCATCT 300 96103N_T_NA TCCAGATCACCCACTCTCTCGGTGGTGGTACTGGTGCCGGAATGGGCACGCTCCTCATCT 300 96101N_T_NA TCCAGATCACCCACTCTCTCGGTGGTGGTACTGGTGCCGGAATGGGCACGCTCCTCATCT 300 10101N_T_EU TCCAGATCACCCACTCTCTCGGTGGTGGTACTGGTGCCGGAATGGGCACGCTCCTCATCT 300 10082N_T_EU TCCAGATCACCCACTCTCTCGGTGGTGGTACTGGTGCCGGAATGGGCACGCTCCTCATCT 300 10083N_T_EU TCCAGATCACCCACTCTCTCGGTGGTGGTACCGGTGCCGGAATGGGCACGCTCCTCATCT 300 07019_OUT TCCAGATCACCCACTCTCTCGGTGGTGGTACCGGTGCCGGTATGGGTACGCTCCTCATCT 300 07020_OUT TCCAGATCACCCACTCTCTCGGTGGTGGTACCGGTGCCGGTATGGGTACGCTCCTCATCT 300 * ***************************** ******** ***** *************

10099M_W_EU CCAAGATCCGCGAGAGGTTCCCTGACCGTATGATGGCCACCTTCTCGGTTGTCCCCTCGC 360 10098M_W_EU CCAAGATCCGCGAGAAGTTCCCTGACCGTATGATGGCCACCTTCTCGGTTGTCCCCTCGC 360 10149M_W_EU CCAAGATCCGCGAGGAGTTCCCTGACCGTATGATGGCCACCTTCTCGGTTGTCCCCTCGC 360 99027M_W_NA CCAAGATCCGCGAGAAGTTCCCTGACCGTATGATGGCCACCTTCTCGGTTGTCCCCTCGC 360 99049M_W_NA CCAAGATCCGCGAGAAGTTCCCTGACCGTATGATGGCCACCTTCTCGGTTGTCCCCTCGC 360 99061M_W_NA CCAAGATCCGCGAGGAGTTCCCTGACCGTATGATGGCCACCTTCTCGGTTGTCCCCTCGC 360 99084N_W_NA CCAAGATCCGTGAGGAGTTCCCCGACCGCATGATGGCCACCTTCTCGGTTGTTCCCTCGC 360 10107N_W_EU CCAAGATCCGTGAGGAGTTCCCCGACCGCATGATGGCCACCTTCTCGGTTGTTCCCTCGC 360 10106N_W_EU CCAAGATCCGTGAGGAGTTCCCCGACCGCATGATGGCCACCTTCTCGGTTGTTCCCTCGC 360 99006N_W_NA CCAAGATCCGTGAGGAGTTCCCCGACCGCATGATGGCCACCTTCTCGGTTGTTCCCTCGC 360 99069N_W_NA CCAAGATCCGTGAGGAGTTCCCCGACCGCATGATGGCCACCTTCTCGGTTGTTCCCTCGC 360 10152N_W_EU CCAAGATCCGTGAGAGGTTCCCCGACCGCATGATGGCCACCTTCTCGGTTGTTCCCTCGC 360 96107N_T_NA CCAAGATCCGTGAGGAGTTCCCCGACCGCATGATGGCCACCTTCTCGGTTGTTCCCTCGC 360 96103N_T_NA CCAAGATCCGTGAGGAGTTCCCCGACCGCATGATGGCCACCTTCTCGGTTGTTCCCTCGC 360 96101N_T_NA CCAAGATCCGTGAGGAGTTCCCCGACCGCATGATGGCCACCTTCTCGGTTGTTCCCTCGC 360 10101N_T_EU CCAAGATCCGTGAGGAGTTCCCCGACCGCATGATGGCCACCTTCTCGGTTGTTCCCTCGC 360 10082N_T_EU CCAAGATCCGTGAGGAGTTCCCCGACCGCATGATGGCCACCTTCTCGGTTGTTCCCTCGC 360 10083N_T_EU CCAAGATCCGTGAGAAGTTCCCCGACCGTATGATGGCCACCTTCTCGGTTGTTCCCTCGC 360 07019_OUT CCAAGATCCGCGAGGAGTTCCCTGATCGCATGATGGCCACTTTCTCGGTCGTCCCCTCGC 360 07020_OUT CCAAGATCCGCGAGGAGTTCCCTGATCGCATGATGGCCACTTTCTCGGTCGTCCCCTCGC 360 ********** *** ****** ** ** *********** ******** ** *******


10099M_W_EU CCAAGGTCTCCGACACCGTCGTCGAGCCTTACAACGCCACCCTGTCCATCCACCAGCTTG 420 10098M_W_EU CCAAGGTCTCCGACACCGTCGTCGAGCCTTACAACGCCACCCTGTCCATCCACCAGCTTG 420 10149M_W_EU CCAAGGTCTCCGACACCGTCGTCGAGCCTTACAACGCCACCCTGTCCATCCACCAGCTTG 420 99027M_W_NA CCAAGGTCTCCGACACCGTCGTCGAGCCTTACAACGCCACCCTGTCCATCCACCAGCTTG 420 99049M_W_NA CCAAGGTCTCCGACACCGTCGTCGAGCCTTACAACGCCACCCTGTCCATCCACCAGCTTG 420 99061M_W_NA CCAAGGTCTCCGACACCGTCGTCGAGCCTTACAACGCCACCCTGTCCATCCACCAGCTTG 420 99084N_W_NA CCAAGGTCTCCGACACCGTCGTCGAGCCTTACAATGCCACCCTGTCCATCCACCAGCTTG 420 10107N_W_EU CCAAGGTCTCCGACACCGTCGTCGAGCCTTACAATGCCACCCTGTCCATCCACCAGCTTG 420 10106N_W_EU CCAAGGTCTCCGACACCGTCGTCGAGCCTTACAATGCCACCCTGTCCATCCACCAGCTTG 420 99006N_W_NA CCAAGGTCTCCGACACCGTCGTCGAGCGTTACAATGCCACCCTGTCCATCCACCAGCTTG 420 99069N_W_NA CCAAGGTCTCCGACACCGTCGTCGAGCCTTACAATGCCACCCTGTCCATCCACCAGCTTG 420 10152N_W_EU CCAAGGTCTCCGACACCGTCGTCGAAGCTTACAATGCCACCCTGTCCATCCACCAGCTTG 420 96107N_T_NA CCAAGGTCTCCGACACCGTCGTCGAGCCTTACAACGCCACCCTGTCTATCCACCAGCTTG 420 96103N_T_NA CCAAGGTCTCCGACACCGTCGTCGAGCCTTACAACGCCACCCTGTCTATCCACCAGCTTG 420 96101N_T_NA CCAAGGTCTCCGACACCGTCGTCGAGCCTTACAACGCCACCCTGTCTATCCACCAGCTTG 420 10101N_T_EU CCAAGGTCTCCGACACCGTCGTCGAGCCTTACAACGCCACCCTGTCTATCCACCAGCTTG 420 10082N_T_EU CCAAGGTCTCCGACACCGTCGTCGAGCCTTACAACGCCACCCTGTCTATCCACCAGCTTG 420 10083N_T_EU CCAAGGTCTCCGACACCGTCGTCGAGCCTTACAACGCCACCCTGTCTATCCACCAGCTTG 420 07019_OUT CCAAGGTCTCCGATACCGTTGTCGAGCCTTACAACGCCACTCTGTCCATCCACCAGCTTG 420 07020_OUT CCAAGGTCTCCGATACCGTTGTCGAGCCTTACAACGCCACTCTGTCCATCCACCAGCTTG 420 ************* ***** ***** ****** ***** ***** *************

10099M_W_EU TCGAGAACTCGGACGCGACTTTCTGCATTGACAACGAGGCTCTGTACGACATCTGCATGC 480 10098M_W_EU TCGAGAACTCGGACGCGACTTTCTGCATTGACAACGAGGCTCTGTACGACATCTGCATGC 480 10149M_W_EU TCGAGAACTCGGACGCGACTTTCTGCATTGACAACGAGGCTCTGTACGACATCTGCATGC 480 99027M_W_NA TCGAGAACTCGGACGAGACTTTCTGCATTGACAACGAGGCTCTGTACGACATCTGCATGC 480 99049M_W_NA TCGAGAACTCGGACGAGACTTTCTGCATTGACAACGAGGCTCTGTACGACATCTGCATGC 480 99061M_W_NA TCGAGAACTCGGACGAGACTTTCTGCATTGACAACGAGGCTCTGTACGACATCTGCATGC 480 99084N_W_NA TCGAGAACTCCGACGAGACTTTCTGCATTGACAACGAGGCTCTGTACGACATCTGCATGC 480 10107N_W_EU TCGAGAACTCCGACGAGACTTTCTGCATTGACAACGAGGCTCTGTACGACATCTGCATGC 480 10106N_W_EU TCGAGAACTCCGACGAGACTTTCTGCATTGACAACGAGGCTCTGTACGACATCTGCATGC 480 99006N_W_NA TCGAGAACTCCGACGAGACTTTCTGCATTGACAACGAGGCTCTGTACGACATCTGCATGC 480 99069N_W_NA TCGAGAACTCCGACGAGACTTTCTGCATTGACAACGAGGCTCTGTACGACATCTGCATGC 480 10152N_W_EU TCGAGAACTCCGACGAGACTTTCTGCATTGACAACGAGGCTCTGTACGACATCTGCATGC 480 96107N_T_NA TCGAGAACTCCGACAAGACTTTCTGCATTGACAACGAGGCTCTGTACGACATCTGCATGC 480 96103N_T_NA TCGAGAACTCCGACAAGACTTTCTGCATTGACAACGAGGCTCTGTACGACATCTGCATGC 480 96101N_T_NA TCGAGAACTCCGACAAGACTTTCTGCATTGACAACGAGGCTCTGTACGACATCTGCATGC 480 10101N_T_EU TCGAGAACTCCGACAAGACTTTCTGCATTGACAACGAGGCTCTGTACGACATCTGCATGC 480 10082N_T_EU TCGAGAACTCCGACGAGACTTTCTGCATTGACAACGAGGCTCTGTACGACATCTGCATGC 480 10083N_T_EU TCGAGAACTCCGACGAGACTTTCTGCATTGACAACGAGGCTCTGTACGACATCTGCATGC 480 07019_OUT TCGAGAACTCCGACGCGACCTTCTGCATTGACAACGAGGCTCTGTACGACATCTGCATGC 480 07020_OUT TCGAGAACTCCGACGCGACCTTCTGCATTGACAACGAGGCTCTGTACGACATCTGCATGC 480 ********** *** *** ****************************************

10099M_W_EU GTACCCTGAAGCTCTCCAACCCCTCATACGGCGACCTGAACTACCTCGTCTCTGCCGTCA 540 10098M_W_EU GTACCCTGAAGCTCTCCAATCCCTCATACGGCGACCTGAACTACCTCGTCTCTGCCGTCA 540 10149M_W_EU GTACCCTGAAGCTCTCCAACCCCTCATACGGCGACCTGAACTACCTCGTCTCTGCCGTCA 540 99027M_W_NA GTACCCTGAAGCTCTCCAACCCCTCATACGGGGACCTGAACTACCTCGTCTCTGCGGTCA 540 99049M_W_NA GTACCCTGAAGCTCTCCAACCCCTCATACGGCGACCTGAACTACCTCGTCTCTGCCGTCA 540 99061M_W_NA GTACCCTGAAGCTCTCCAACCCCTCATACGGCGACCTGAACTACCTCGTCTCTGCCGTCA 540 99084N_W_NA GCACCCTGAAGCTCTCCAACCCCTCATACGGCGACCTGAACTACCTCGTCTCCGCCGTCA 540 10107N_W_EU GCACCCTGAAGCTCTCCAACCCCTCATACGGCGACCTGAACTACCTCGTCTCCGCCGTCA 540 10106N_W_EU GCACCCTGAAGCTCTCCAACCCCTCATACGGCGACCTGAACTACCTCGTCTCCGCCGTCA 540 99006N_W_NA GCACCCTGAAGCTCTCCAACCCCTCATACGGCGACCTGAACTACCTCGTCTCCGCCGTCA 540 99069N_W_NA GCACCCTGAAGCTCTCCAACCCCTCATACGGCGACCTGAACTACCTCGTCTCCGCCGTCA 540 10152N_W_EU GCACCCTGAAGCTCTCCAACCCCTCATACGGCGACCTGAACTACCTCGTCTCCGCCGTCA 540 96107N_T_NA GCACCCTGAAGCTCTCCAACCCCTCATACGGCGACCTGAACTACCTCGTCTCCGCCGTCA 540 96103N_T_NA GCACCCTGAAGCTCTCCAACCCCTCATACGGCGACCTGAACTACCTCGTCTCCGCCGTCA 540 96101N_T_NA GCACCCTGAAGCTCTCCAACCCCTCATACGGCGACCTGAACTACCTCGTCTCCGCCGTCA 540 10101N_T_EU GCACCCTGAAGCTCTCCAACCCCTCATACGGCGACCTGAACTACCTCGTCTCCGCCGTCA 540 10082N_T_EU GCACCCTGAAGCTCTCCAACCCCTCATACGGCGACCTGAACTACCTCGTCTCCGCCGTCA 540 10083N_T_EU GCACCCTGAAGCTCTCCAACCCCTCATACGGCGACCTGAACTACCTCGTCTCCGCCGTCA 540 07019_OUT GCACCCTGAAGCTCTCCAACCCCTCGTACGGCGACTTGAACTACCTCGTCTCCGCCGTCA 540 07020_OUT GCACCCTGAAGCTCTCCAACCCCTCGTACGGCGACTTGAACTACCTCGTCTCCGCCGTCA 540 * ***************** ***** ***** *** **************** ** ****

10099M_W_EU TGTCCGGTGTCACCACCTGCCTGCGTTTCCCCGGTCAGCTCAACTCTGACCTGCGCAAGC 600 10098M_W_EU TGTCCGGTGTCACCACCTGCCTGCGTTTCCCCGGTCAGCTCAACTCTGACCTGCGCAAGT 600 10149M_W_EU TGTCCGGTGTCACCACCTGCCTGCGTTTCCCCGGTCAGCTCAACTCTGACCTGCGCAAGT 600

99027M_W_NA TGTCGGGTGTCACCACCTGCCTGCGTTTCCCGGGCCAGCTCAACTCTGACCTGCGCAAGC 600 99049M_W_NA TGTCCGGTGTCACCACCTGCCTGCGTTTCCCCGGCCAGCTCAACTCTGACCTGCGCAAGC 600 99061M_W_NA TGTCCGGTGTCACCACCTGCCTGCGTTTCCCCGGTCAGCTCAACTCTGACCTGCGCAAGC 600 99084N_W_NA TGTCCGGTGTCACCACCTGCCTGCGTTTCCCCGGTCAGCTCAACTCTGACCTGCGCAAGC 600 10107N_W_EU TGTCCGGTGTCACCACCTGCCTGCGTTTCCCCGGTCAGCTCAACTCTGACCTGCGCAAGC 600 10106N_W_EU TGTCCGGTGTCACCACCTGCCTGCGTTTCCCCGGTCAGCTCAACTCTGACCTGCGCAAGC 600 99006N_W_NA TGTCCGGTGTCACCACCTGCCTGCGTTTCCCCGGTCAGCTCAACTCTGACCTGCGCAAGC 600 99069N_W_NA TGTCCGGTGTCACCACCTGCCTGCGTTTCCCCGGTCAGCTCAACTCTGACCTGCGCAAGC 600 10152N_W_EU TGTCCGGTGTCACCACCTGCCTGCGTTTCCCCGGTCAGCTCAACTCTGACCTGCGCAAGT 600 96107N_T_NA TGTCCGGTGTCACCACCTGCCTGCGTTTCCCCGGTCAGCTCAACTCTGACCTGCGCAAGT 600 96103N_T_NA TGTCCGGTGTCACCACCTGCCTGCGTTTCCCCGGTCAGCTCAACTCTGACCTGCGCAAGT 600 96101N_T_NA TGTCCGGTGTCACCACCTGCCTGCGTTTCCCCGGTCAGCTCAACTCTGACCTGCGCAAGC 600 10101N_T_EU TGTCCGGTGTCACCACCTGCCTGCGTTTCCCCGGTCAGCTCAACTCTGACCTGCGCAAGC 600 10082N_T_EU TGTCCGGTGTCACCACCTGCCTGCGTTTCCCCGGTCAGCTCAACTCTGACCTGCGCAAGT 600 10083N_T_EU TGTCCGGTGTCACCACCTGCCTGCGTTTCCCCGGTCAGCTCAACTCTGACCTGCGCAAGT 600 07019_OUT TGTCCGGTGTCACCACTTGCCTGCGTTTCCCTGGTCAGCTCAACTCTGATCTGCGCAAGC 600 07020_OUT TGTCCGGTGTCACCACTTGCCTGCGTTTCCCTGGTCAGCTCAACTCTGATCTGCGCAAGC 600 **** *********** ************** ** ************** *********

10099M_W_EU CCGCCGTCAACATGGTGCCCTTCCCGCGTCTGCATTTCTTCATGGTCGGATTCGCTCCCC 660 10098M_W_EU CCGCCGTCAACATGGTGCCCTTCCCGCGTCTGCATTTCTTCATGGTCGGATTCGCTCCCC 660 10149M_W_EU TCGCCGTCAACATGGTGCCCTTCCCGCGTCTGCATTTCTTCATGGTCGGATTCGCTCCCC 660 99027M_W_NA TCGCCGTCAACATGGTGCCCTTCCCGCGTCTGCATTTCTTCATGGTCGGATTCGCTCCCC 660 99049M_W_NA TCGCCGTCAACATGGTGCCCTTCCCGCGTCTGCATTTCTTCATGGTCGGATTCGCTCCCC 660 99061M_W_NA TCGCCGTCAACATGGTGCCCTTCCCGCGTCTGCATTTCTTCATGGTCGGATTCGCTCCCC 660 99084N_W_NA TCGCCGTTAACATGGTGCCCTTCCCGCGTCTGCATTTCTTCATGGTCGGCTTCGCTCCCC 660 10107N_W_EU TCGCCGTTAACATGGTGCCCTTCCCGCGTCTGCATTTCTTCATGGTCGGCTTCGCTCCCC 660 10106N_W_EU TCGCCGTTAACATGGTGCCCTTCCCGCGTCTGCATTTCTTCATCCTCGGCTTCGCTCCCC 660 99006N_W_NA TCGCCGTTAACATGGTGCCCTTCCCGCGTCTGCATTTCTTCATGGTCGGCTTCGCTCCCC 660 99069N_W_NA TCGCCGTTAACATGGTGCCCTTCCCGCGTCTGCATTTCTTCATGGTCGGCTTCGCTCCCC 660 10152N_W_EU CCGCCGTTAACATGGTGCCCTTCCCGCGTCTGCATTTCTTCATGGTCGGCTTCGCTCCCC 660 96107N_T_NA CCGCCGTTAACATGGTGCCCTTCCCGCGTCTGCATTTCTTCATGGTCGGCTTCGCTCCCC 660 96103N_T_NA CCGCCGTTAACATGGTGCCCTTCCCGCGTCTGCATTTCTTCATGGTCGGCTTCGCTCCCC 660 96101N_T_NA TCGCCGTTAACATGGTGCCCTTCCCGCGTCTGCATTTCTTCATGGTCGGCTTCGCTCCCC 660 10101N_T_EU TCGCCGTTAACATGGTGCCCTTCCCGCGTCTGCATTTCTTCATGGTCGGCTTCGCTCCCC 660 10082N_T_EU TCGCCGTTAACATGGTGCCCTTCCCGCGTCTGCATTTCTTCATGGTCGGCTTCGCTCCCC 660 10083N_T_EU TCGCCGTTAACATGGTGCCCTTCCCGCGTCTGCATTTCTTCATGGTCGGCTTCGCTCCCC 660 07019_OUT TCGCCGTCAACATGGTGCCATTCCCTCGTCTGCACTTCTTCATGGTCGGCTTCGCTCCCC 660 07020_OUT TCGCCGTCAACATGGTGCCATTCCCTCGTCTGCACTTCTTCATGGTCGGCTTCGCTCCCC 660 ****** *********** ***** ******** ******** **** **********

10099M_W_EU TGACCAGCCGTGGCGCCCACTCTTTCCGTGCCGTCACCGTCCCCGAGTTGACCCAGCAGA 720 10098M_W_EU TGACCAGCCGTGGCGCCCACTCTTTCCGTGCCGTCACCGTCCCCGAGTTGACCCAGCAGA 720 10149M_W_EU TGACCAGCCGTGGCGCCCACTCTTTCCGTGCCGTCACCGTCCCCGAGTTGACCCAGCAGA 720 99027M_W_NA TGACCAGCCGTGGCGCCCACTCTTTCCGTGCCGTCACCGTCCCCGAGTTGACCCAGCAGA 720 99049M_W_NA TGACCAGCCGTGGCGCCCACTCTTTCCGTGCCGTCACCGTCCCCGAGTTGACCCAGCAGA 720 99061M_W_NA TGACCAGCCGTGGCGCCCACTCTTTCCGTGCCGTCACCGTCCCCGAGTTGACCCAGCAGA 720 99084N_W_NA TGACCAGCCGTGGCGCCCACTCCTTCCGTGCCGTCACCGTCCCCGAGTTGACCCAGCAGA 720 10107N_W_EU TGACCAGCCGTGGCGCCCACTCCTTCCGTGCCGTCACCGTCCCCGAGTTGACCCAGCAGA 720 10106N_W_EU TGACCAGCCGTGGCGCCCACTCCTTCCGTGCCGTCACCGTCCCCGAGTTGACCCAGCAGA 720 99006N_W_NA TGACCAGCCGTGGCGCCCACTCCTTCCGTGCCGTCACCGTCCCCGAGTTGACCCAGCAGA 720 99069N_W_NA TGACCAGCCGTGGCGCCCACTCCTTCCGTGCCGTCACCGTCCCCGAGTTGACCCAGCAGA 720 10152N_W_EU TGACCAGCCGTGGCGCCCACTCCTTCCGTGCCGTCACCGTCCCCGAGTTGACCCAGCAGA 720 96107N_T_NA TGACCAGCCGTGGCGCCCACTCCTTCCGTGCCGTCACCGTCCCCGAGTTGACCCAGCAGA 720 96103N_T_NA TGACCAGCCGTGGCGCCCACTCCTTCCGTGCCGTCACCGTCCCCGAGTTGACCCAGCAGA 720 96101N_T_NA TGACCAGCCGTGGCGCCCACTCCTTCCGTGCCGTCACCGTCCCCGAGTTGACCCAGCTGT 720 10101N_T_EU TGACCAGCCGTGGCGCCCACTCCTTCCGTGCCGTCACCGTCCCCGAGTTGACCCAGCAGA 720 10082N_T_EU TGACCAGCCGTGGCGCCCACTCCTTCCGTGCCGTCACCGTCCCCGAGTTGACCCAGCAGA 720 10083N_T_EU TGACCAGCCGTGGCGCCCACTCCTTCCGTGCCGTCACCGTCCCCGAGTTGACCCAGCAGA 720 07019_OUT TGACCAGCCGTGGCGCCCACTCTTTCCGTGCCGTCACCGTCCCCGAGTTGACCCAGCAGA 720 07020_OUT TGACCAGCCGTGGCGCCCACTCTTTCCGTGCCGTCACCGTCCCCGAGTTGACCCAGCAGA 720 ********************** ********************************** *

Appendix 2.5 Alignment of EF-1α sequences from M. nivale and M. majus. Primer binding sites are indicated by shading.

99027M_W_NA AGACGCTCC-GGTCACCGTGATTTCATCAAGAACATGATCACTGGTACTTCCCAGGCCGA 59 99064M_W_NA AGACGCTCCCGGTCACCGTGATTTCATCAAGAACATGATCACTGGTACTTCCCAGGCCGA 60 10096M_W_EU AGACGCTCCCGGTCACCGTGATTTCATCAAGAACATGATCACTGGTACTTCCCAGGCCGA 60 10099M_W_EU AGACGCTCCCGGTCACCGTGATTTCATCAAGAACATGATCACTGGTACTTCCCAGGCCGA 60 10149M_W_EU AGACGCTCCCGGTCACCGTGATTTCATCAAGAACATGATCACTGGTACTTCCCAGGCCGA 60 99049M_W_NA AGACGCTCCCGGTCACCGTGATTTCATCAAGAACATGATCACTGGTACTTCCCAGGCCGA 60 10083N_T_EU AGACGCTCCCGGTCACCGTGATTTCATCAAGAACATGATCACTGGTACTTCCCAGGCCGA 60 99006N_W_NA AGACGCTCCCGGTCACCGTGATTTCATCAAGAACATGATCACTGGTACTTCCCAGGCCGA 60 96103N_T_NA AGACGCTCCCGGTCACCGTGATTTCATCAAGAACATGATCACTGGTACTTCCCAGGCCGA 60 99084N_W_NA AGACGCTCCCGGTCACCGTGATTTCATCAAGAACATGATCACTGGTACTTCCCAGGCCGA 60 10106N_W_EU AGACGCTCCCGGTCACCGTGATTTCATCAAGAACATGATCACTGGTACTTCCCAGGCCGA 60 10101N_T_EU AGACGCTCC-GGTCACCGTGATTTCATCAAGAACATGATCACTGGTACTTCCCAGGCCGA 59 10107N_W_EU AGACGCTCC-GGTCACCGTGATTTCATCAAGAACATGATCACTGGTACTTCCCAGGCCGA 59 99069N_W_NA AGACGCTCCCGGTCACCGTGATTTCATCAAGAACATGATCACTGGTACTTCCCAGGCCGA 60 10152N_W_EU AGACGCTCCCGGTCACCGTGATTTCATCAAGAACATGATCACTGGTACTTCCCAGGCCGA 60 10082N_T_EU AGACGCTCCCGGTCACCGTGATTTCATCAAGAACATGATCACTGGTACTTCCCAGGCCGA 60 96101N_T_NA AGACGCTCCCGGTCACCGTGATTTCATCAAGAACATGATCACTGGTACTTCCCAGGCCGA 60 96107N_T_NA AGACGCTCCCGGTCACCGTGATTTCATCAAGAACATGATCACTGGTACTTCCCAGGCCGA 60 X_hypoxylon_OUT ------CCTCGTGANTTCATCAAGAACATGATTACTGGTACCTCGCAAGCCGA 47 * ***** ***************** ******** ** ** *****

99027M_W_NA TTGCGCCATTCTCATCATTGCCGCCGGTACTGGTGAGTTCGAGGCTGGTATCTCCAAGGA 119 99064M_W_NA TTGCGCCATTCTCATCATTGCCGCCGGTACTGGTGAGTTCGAGGCTGGTATCTCCAAGGA 120 10096M_W_EU TTGCGCCATTCTCATCATTGCCGCCGGTACTGGTGAGTTCGAGGCTGGTATCTCCAAGGA 120 10099M_W_EU TTGCGCCATTCTCATCATTGCCGCCGGTACTGGTGAGTTCGAGGCTGGTATCTCCAAGGA 120 10149M_W_EU TTGCGCCATTCTCATCATTGCCGCCGGTACTGGTGAGTTCGAGGCTGGTATCTCCAAGGA 120 99049M_W_NA TTGCGCCATTCTCATCATTGCCGCCGGTACTGGTGAGTTCGAGGCTGGTATCTCCAAGGA 120 10083N_T_EU TTGCGCCATTCTCATCATTGCCGCCGGTACTGGTGAGTTCGAGGCTGGTATCTCCAAGGA 120 99006N_W_NA TTGCGCCATTCTCATCATTGCCGCCGGTACTGGTGAGTTCGAGGCTGGTATCTCCAAGGA 120 96103N_T_NA TTGCGCCATTCTCATCATTGCCGCCGGTACTGGTGAGTTCGAGGCTGGTATCTCCAAGGA 120 99084N_W_NA TTGCGCCATTCTCATCATTGCCGCCGGTACTGGTGAGTTCGAGGCTGGTATCTCCAAGGA 120 10106N_W_EU TTGCGCCATTCTCATCATTGCCGCCGGTACTGGTGAGTTCGAGGCTGGTATCTCCAAGGA 120 10101N_T_EU TTGCGCCATTCTCATCATTGCCGCCGGTACTGGTGAGTTCGAGGCTGGTATCTCCAAGGA 119 10107N_W_EU TTGCGCCATTCTCATCATTGCCGCCGGTACTGGTGAGTTCGAGGCTGGTATCTCCAAGGA 119 99069N_W_NA TTGCGCCATTCTCATCATTGCCGCCGGTACTGGTGAGTTCGAGGCTGGTATCTCCAAGGA 120 10152N_W_EU TTGCGCCATTCTCATCATTGCCGCCGGTACTGGTGAGTTCGAGGCTGGTATCTCCAAGGA 120 10082N_T_EU TTGCGCCATTCTCATCATTGCCGCCGGTACTGGTGAGTTCGAGGCTGGTATCTCCAAGGA 120 96101N_T_NA TTGCGCCATTCTCATCATTGCCGCCGGTACTGGTGAGTTCGAGGCTGGTATCTCCAAGGA 120 96107N_T_NA TTGCGCCATTCTCATCATTGCCGCCGGTACTGGTGAGTTCGAGGCTGGTATCTCCAAGGA 120 X_hypoxylon_OUT TTGCGCCATTCTCATCATTGCTGCCGGTACTGGTGAGTTCGAGGCTGGTATCTCCAAGGA 107 ********************* **************************************

99027M_W_NA TGGCCAGACTCGTGAGCACGCTCTGCTCGCCTACACCCTCGGTGTCAAGCAGCTCATCGT 179 99064M_W_NA TGGCCAGACTCGTGAGCACGCTCTGCTCGCCTACACCCTCGGTGTCAAGCAGCTCATCGT 180 10096M_W_EU TGGCCAGACTCGTGAGCACGCTCTGCTCGCCTACACCCTCGGTGTCAAGCAGCTCATCGT 180 10099M_W_EU TGGCCAGACTCGTGAGCACGCTCTGCTCGCCTACACCCTCGGTGTCAAGCAGCTCATCGT 180 10149M_W_EU TGGCCAGACTCGTGAGCACGCTCTGCTCGCCTACACCCTCGGTGTCAAGCAGCTCATCGT 180 99049M_W_NA TGGCCAGACTCGTGAGCACGCTCTGCTCGCCTACACCCTCGGTGTCAAGCAGCTCATCGT 180 10083N_T_EU TGGCCAGACTCGTGAGCACGCTCTGCTCGCCTACACCCTTGGTTTCAAGCAGCTCATCGT 180 99006N_W_NA TGGCCAGACTCGCGAGCACGCTCTGCTCGCCTACACCCTTGGTGTCAAGCAGCTCATCGT 180 96103N_T_NA TGGCCAGACTCGTGAGCACGCTCTGCTCGCCTACACCCTTGGTGTCAAGCAGCTCATCGT 180 99084N_W_NA TGGCCAGACTCGTGAGCACGCTCTGCTCGCCTACACCCTTGGTGTCAAGCAGCTCATCGT 180 10106N_W_EU TGGCCAGACTCGTGAGCACGCTCTGCTCGCCTACACCCTTGGTGTCAAGCAGCTCATCGT 180 10101N_T_EU TGGCCAGACTCGTGAGCACGCTCTGCTCGCCTACACCCTTGGTGTCAAGCAGCTCATCGT 179 10107N_W_EU TGGCCAGACTCGTGAGCACGCTCTGCTCGCCTACACCCTTGGTGTCAAGCAGCTCATCGT 179 99069N_W_NA TGGCCAGACTCGTGAGCACGCTCTGCTCGCCTACACCCTTGGTGTCAAGCAGCTCATCGT 180 10152N_W_EU TGGCCAGACTCGTGAGCACGCTCTGCTCGCCTACACCCTTGGTGTCAAGCAGCTCATCGT 180 10082N_T_EU TGGCCAGACTCGTGAGCACGCTCTGCTCGCCTACACCCTTGGTGTCAAGCAGCTCATCGT 180 96101N_T_NA TGGCCAGACTCGTGAGCACGCTCTGCTCGCCTACACCCTTGGTGTCAAGCAGCTCATCGT 180 96107N_T_NA TGGCCAGACTCGTGAGCACGCTCTGCTCGCCTACACCCTTGGTGTCAAGCAGCTCATCGT 180 X_hypoxylon_OUT TGGCCAGACTCGTGAGCACGCTCTGCTCGCTTTCACCCTTGGTGTCAAGCAGCTCATCGT 167 ************ ***************** * ****** *** ****************

99027M_W_NA CGCCATCAACAAGATGGACACCACCAAGTGGTCCGAGGCTCGTTTCCAGGAGATCATCAA 239 99064M_W_NA CGCCATCAACAAGATGGACACCACCAAGTGGTCCGAGGCTCGTTTCCAGGAGATCATCAA 240 10096M_W_EU CGCCATCAACAAGATGGACACCACCAAGTGGTCCGAGGCTCGTTTCCAGGAGATCATCAA 240 10099M_W_EU CGCCATCAACAAGATGGACACCACCAAGTGGTCCGAGGCTCGTTTCCAGGAGATCATCAA 240 10149M_W_EU CGCCATCAACAAGATGGACACCACCAAGTGGTCCGAGGCTCGTTTCCAGGAGATCATCAA 240 99049M_W_NA CGCCATCAACAAGATGGACACCACCAAGTGGTCCGAGGCTCGTTTCCAGGAGATCATCAA 240 10083N_T_EU CGCCATCAACAAGATGGACACCACCAAGTGGTCCGAGGCTCGTTTCCAGGAGATCATCAA 240 99006N_W_NA CGCCATCAACAAGATGGACACCACCAAGTGGTCCGAGGCTCGTTTCCAGGAGATCATCAA 240 96103N_T_NA CGCCATCAACAAGATGGACACCACCAAGTGGTCCGAGGCTCGTTTCCAGGAGATCATCAA 240 99084N_W_NA CGCCATCAACAAGATGGACACCACCAAGTGGTCCGAGGCTCGTTTCCAGGAGATCATCAA 240 10106N_W_EU CGCCATCAACAAGATGGACACCACCAAGTGGTCCGAGGCTCGTTTCCAGGAGATCATCAA 240 10101N_T_EU CGCCATCAACAAGATGGACACCACCAAGTGGTCCGAGGCTCGTTTCCAGGAGATCATCAA 239 10107N_W_EU CGCCATCAACAAGATGGACACCACCAAGTGGTCCGAGGCTCGTTTCCAGGAGATCATCAA 239 99069N_W_NA CGCCATCAACAAGATGGACACCACCAAGTGGTCCGAGGCTCGTTTCCAGGAGATCATCAA 240 10152N_W_EU CGCCATCAACAAGATGGACACCACCAAGTGGTCCGAGGCTCGTTTCCAGGAGATCATCAA 240 10082N_T_EU CGCCATCAACAAGATGGACACCACCAAGTGGTCCGAGGCTCGTTTCCAGGAGATCATCAA 240 96101N_T_NA CGCCATCAACAAGATGGACACCACCAAGTGGTCCGAGGCTCGTTTCCAGGAGATCATCAA 240 96107N_T_NA CGCCATCAACAAGATGGACACCACCAAGTGGTCCGAGGCTCGTTTCCAGGAGATCATCAA 240 X_hypoxylon_OUT CGCTATCAACAAGATGGACACTGCCCAGTGGTCTGAGCAGCGTTTCAACGAGATTGTCAA 227 *** ***************** ** ******* *** ****** * ***** ****

99027M_W_NA GGAGACCTCCTCCTTCATCAAGAAGGTCGGCTACAACCCCAAGCAGGTCGCTTTCGTCCC 299 99064M_W_NA GGAGACCTCCTCCTTCATCAAGAAGGTCGGCTACAACCCCAAGCAGGTCGCTTTCGTCCC 300 10096M_W_EU GGAGACCTCCTCCTTCATCAAGAAGGTCGGCTACAACCCCAAGCAGGTCGCTTTCGTCCC 300 10099M_W_EU GGAGACCTCCTCCTTCATCAAGAAGGTCGGCTACAACCCCAAGCAGGTCGCTTTCGTCCC 300 10149M_W_EU GGAGACCTCCTCCTTCATCAAGAAGGTCGGCTACAACCCCAAGCAGGTCGCTTTCGTCCC 300 99049M_W_NA GGAGACCTCCTCCTTCATCAAGAAGGTTGGCTACAACCCCAAGCAGGTCGCTTTCGTCCC 300 10083N_T_EU GGAGACCTCGTCCTTCATCAAGAAGGTCGGCTACAACCCCAAGCAGGTCGCTTTCGTTCC 300 99006N_W_NA GGAGACCTCGTCCTTCATCAAGAAGGTCGGCTACAACCCCAAGCAGGTCGCTTTCGTTCC 300 96103N_T_NA GGAGACCTCGTCCTTCATCAAGAAGGTCGGCTACAACCCCAAGCAGGTCGCTTTCGTTCC 300 99084N_W_NA GGAGACCTCGTCCTTCATCAAGAAGGTCGGCTACAACCCCAAGCAGGTCGCTTTCGTTCC 300 10106N_W_EU GGAGACCTCGTCCTTCATCAAGAAGGTCGGCTACAACCCCAAGCAGGTCGCTTTCGTTCC 300 10101N_T_EU GGAGACCTCGTCCTTCATCAAGAAGGTCGGCTACAACCCCAAGCAGGTCGCTTTCGTTCC 299 10107N_W_EU GGAGACCTCGTCCTTCATCAAGAAGGTCGGCTACAACCCCAAGCAGGTCGCTTTCGTTCC 299 99069N_W_NA GGAGACCTCGTCCTTCATCAAGAAGGTCGGCTACAACCCCAAGCAGGTCGCTTTCGTTCC 300 10152N_W_EU GGAGACCTCGTCCTTCATCAAGAAGGTCGGCTACAACCCCAAGCAGGTCGCTTTCGTTCC 300 10082N_T_EU GGAGACCTCGTCCTTCATCAAGAAGGTCGGCTACAACCCCAAGCAGGTCGCTTTCGTTCC 300 96101N_T_NA GGAGACCTCGTCCTTCATCAAGAAGGTCGGCTACAACCCCAAGCAGGTCGCTTTCGTTCC 300 96107N_T_NA GGAGACCTCGTCCTTCATCAAGAAGGTCGGCTACAACCCCAAGCAGGTCGCTTTCGTTCC 300 X_hypoxylon_OUT GGAGACCTCTTCTTTCATCAAGAAGGTCGGTTTCAACCCCAAGACCGTTGCCTTCGTCCC 287 ********* ** ************** ** * ********** ** ** ***** **

99027M_W_NA CATCTCCGGCTTCAACGGCGACAACATGCTCGAGGTTTCCACCAACGCCCCCTGGTACAA 359 99064M_W_NA CATCTCCGGCTTCAACGGCGACAACATGCTCGAGGTTTCCACCAACGCCCCCTGGTACAA 360 10096M_W_EU CATCTCCGGCTTCAACGGCGACAACATGCTCGAGGTTTCCACCAACGCCCCCTGGTACAA 360 10099M_W_EU CATCTCCGGCTTCAACGGCGACAACATGCTCGAGGTTTCCACCAACGCCCCCTGGTACAA 360 10149M_W_EU CATCTCCGGCTTCAACGGCGACAACATGCTCGAGGTTTCCACCAACGCCCCCTGGTACAA 360 99049M_W_NA CATCTCCGGCTTCAACGGCGACAACATGCTCGAGGTTTCCACCAACGCCCCCTGGTACAA 360 10083N_T_EU CATCTCCGGCTTCAACGGCGACAACATGCTCGAGGTTTCCACCAACGCCCCCTGGTACAA 360 99006N_W_NA CATCTCCGGCTTCAACGGCGACAACATGCTCGAGGTTTCCACCAACGCCCCCTGGTACAA 360 96103N_T_NA CATCTCCGGCTTCAACGGCGACAACATGCTCGAGGTTTCCACCAACGCCCCCTGGTACAA 360 99084N_W_NA CATCTCCGGCTTCAACGGCGACAACATGCTCGAGGTTTCCACCAACGCCCCCTGGTACAA 360 10106N_W_EU CATCTCCGGCTTCAACGGCGACAACATGCTCGAGGTTTCCACCAACGCCCCCTGGTACAA 360 10101N_T_EU CATCTCCGGCTTCAACGGCGACAACATGCTCGAGGTTTCCACCAACGCCCCCTGGTACAA 359 10107N_W_EU CATCTCCGGCTTCAACGGCGACAACATGCTCGAGGTTTCCACCAACGCCCCCTGGTACAA 359 99069N_W_NA CATCTCCGGCTTCAACGGCGACAACATGCTCGAGGTTTCCACCAACGCCCCCTGGTACAA 360 10152N_W_EU CATCTCCGGCTTCAACGGCGACAACATGCTCGAGGTTTCCACCAACGCCCCCTGGTACAA 360 10082N_T_EU CATCTCCGGCTTCAACGGCGACAACATGCTCGAGGTTTCCACCAACGCCCCCTGGTACAA 360 96101N_T_NA CATCTCCGGCTTCAACGGCGACAACATGCTCGAGGTTTCCACCAACGCCCCCTGGTACAA 360 96107N_T_NA CATCTCCGGCTTCAACGGCGACAACATGCTCGAGGTTTCCACCAACGCCCCCTGGTACAA 360 X_hypoxylon_OUT CATCTCTGGTTTCAACGGCGACAACATGCTTGAGCTCACCAAGAACGCTCCCTGGTACAA 347 ****** ** ******************** *** * *** ***** ***********

99027M_W_NA      GGGTTGGGAGAAGGAGATTGGC---GGCAACAAGTTCTCTGGCAAGACCCTTCTTGAGGC 416
99064M_W_NA      GGGTTGGGAGAAGGAGATTGGC---GGCAACAAGTTCTCTGGCAAGACCCTTCTTGAGGC 417
10096M_W_EU      GGGTTGGGAGAAGGAGATTGGC---GGCAACAAGTTCTCTGGCAAGACCCTTCTTGAGGC 417
10099M_W_EU      GGGTTGGGAGAAGGAGATTGGC---GGCAACAAGTTCTCTGGCAAGACCCTTCTTGAGGC 417
10149M_W_EU      GGGTTGGGAGAAGGAGATTGGC---GGCAACAAGTTCTCTGGCAAGACCCTTCTTGAGGC 417
99049M_W_NA      GGGTTGGGAGAAGGAGATTGGC---GGCAACAAGTTCTCTGGCAAGACCCTTCTTGAGGC 417
10083N_T_EU      GGGTTGGGAGAAGGAGATTGGC---GGCAACAAGTTCTCCGGCAAGACCCTTCTTGAGGC 417
99006N_W_NA      GGGTTGGGAGAAGGAGATTGGC---GGCAACAAGTTCTCCGGCAAGACCCTTCTTGAGGC 417
96103N_T_NA      GGGTTGGGAGAAGGAGATTGGC---GGCAACAAGTTCTCCGGCAAGACCCTTCTTGAGGC 417
99084N_W_NA      GGGTTGGGAGAAGGAGATTGGC---GGCAACAAGTTCTCCGGCAAGACCCTTCTTGAGGC 417
10106N_W_EU      GGGTTGGGAGAAGGAGATTGGC---GGCAACAAGTTCTCCGGCAAGACCCTTCTTGAGGC 417
10101N_T_EU      GGGTTGGGAGAAGGAGATTGGC---GGCAACAAGTTCTCCGGCAAGACCCTTCTTGAGGC 416
10107N_W_EU      GGGTTGGGAGAAGGAGATTGGC---GGCAACAAGTTCTCCGGCAAGACCCTTCTTGAGGC 416
99069N_W_NA      GGGTTGGGAGAAGGAGATTGGC---GGCAACAAGTTCTCCGGCAAGACCCTTCTTGAGGC 417
10152N_W_EU      GGGTTGGGAGAAGGAGATTGGC---GGCAACAAGTTCTCCGGCAAGACCCTTCTTGAGGC 417
10082N_T_EU      GGGTTGGGAGAAGGAGATTGGC---GGCAACAAGTTCTCCGGCAAGACCCTTCTTGAGGC 417
96101N_T_NA      GGGTTGGGAGAAGGAGATTGGC---GGCAACAAGTTCTCCGGCAAGACCCTTCTTGAGGC 417
96107N_T_NA      GGGTTGGGAGAAGGAGATTGGC---GGCAACAAGTTCTCCGGCAAGACCCTTCTTGAGGC 417
X_hypoxylon_OUT  GGGCTGGGAGAAGGAGGGTGCCAAGGGTGTCAAGATCAGCGGCAAGACTCTCCTCGACGC 407
                 *** ************  ** *   **   **** **   ******** ** ** ** **

99027M_W_NA      CATCGAC 423
99064M_W_NA      CATCGAC 424
10096M_W_EU      CATCGAC 424
10099M_W_EU      CATCGAC 424
10149M_W_EU      CATCGAC 424
99049M_W_NA      CATCGAT 424
10083N_T_EU      CATCGAC 424
99006N_W_NA      CATCGAC 424
96103N_T_NA      CATCGAC 424
99084N_W_NA      CATCGAC 424
10106N_W_EU      CATCGAC 424
10101N_T_EU      CATCGAC 423
10107N_W_EU      CATCGAC 423
99069N_W_NA      CATCGAC 424
10152N_W_EU      CATCGAC 424
10082N_T_EU      CATCGAC 424
96101N_T_NA      CATCGAC 424
96107N_T_NA      CATCGAC 424
X_hypoxylon_OUT  CATTGAT 414
                 *** **

Appendix 2.6 Alignment of ITS sequences from M. nivale and M. majus. Primer binding sites and RsaI restriction sites are indicated by shading.

RsaI sites: G- 99027M_W_NA CTCCAAACCATGTGAACTTACCACTGTTGCCTCGGTGGATGGTGCTGTCTCTCGGGACGG 60 99064M_W_NA CTCCAAACCATGTGAACTTACCACTGTTGCCTCGGTGGATGGTGCTGTCTCTCGGGACGG 60 99049M_W_NA CTCCAAACCATGTGAACTTACCACTGTTGCCTCGGTGGATGGTGCTGTCTCTCGGGACGG 60 99061M_W_NA CTCCAAACCATGTGAACTTACCACTGTTGCCTCGGTGGATGGTGCTGTCTCTCGGGACGG 60 99007N_W_NA CTCCAAACCATGTGAACTTACCACTGTTGCCTCGGTGGATGGTGCTGTCTCTCGGGACGG 60 10152N_W_EU CTCCAAACCATGTGAACTTACCACTGTTGCCTCGGTGGATGGTGCTGTCTCTCGGGACGG 60 10102N_T_EU CTCCAAACCATGTGAACTTACCACTGTTGCCTCGGTGGATGGTGCTGTCTCTCGGGACGG 60 10101N_T_EU CTCCAAACCATGTGAACTTACCACTGTTGCCTCGGTGGATGGTGCTGTCTCTCGGGACGG 60 96103N_T_NA CTCCAAACCATGTGAACTTACCACTGTTGCCTCGGTGGATGGTGCTGTCTCTCGGGACGG 60 10085N_T_NA CTCCAAACCATGTGAACTTACCACTGTTGCCTCGGTGGATGGTGCTGTCTCTCGGGACGG 60 99084N_W_NA CTCCAAACCATGTGAACTTACCACTGTTGCCTCGGTGGATGGTGCTGTCTCTCGGGACGG 60 96107N_T_NA CTCCAAACCATGTGAACTTACCACTGTTGCCTCGGTGGATGGTGCTGTCTCTCGGGACGG 60 10083N_T_EU CTCCAAACCATGTGAACTTACCACTGTTGCCTCGGTGGATGGTGCTGTCTCTCGGGACGG 60 10082N_T_EU CTCCAAACCATGTGAACTTACCACTGTTGCCTCGGTGGATGGTGCTGTCTCTCGGGACGG 60 96101N_T_NA CTCCAAACCATGTGAACTTACCACTGTTGCCTCGGTGGATGGTGCTGTCTCTCGGGACGG 60 99069N_W_NA CTCCAAACCATGTGAACTTACCACTGTTGCCTCGGTGGATGGTGCTGTCTCTCGGGACGG 60 10151M_W_EU CTCCAAACCATGTGAACTTACCACTGTTGCCTCGGTGGATGGTGCTGTCTCTCGGGACGG 60 10096M_W_EU CTCCAAACCATGTGAACTTACCACTGTTGCCTCGGTGGATGGTGCTGTCTCTCGGGACGG 60 10099M_W_EU CTCCAAACCATGTGAACTTACCACTGTTGCCTCGGTGGATGGTGCTGTCTCTCGGGACGG 60 10149M_W_EU CTCCAAACCATGTGAACTTACCACTGTTGCCTCGGTGGATGGTGCTGTCTCTCGGGACGG 60 07019_MB_OUT CTCCAAACCATGTGAACTTACCACTGTTGCCTCGGTGGTCGGTGCTG------GAAACAG 54 07020_MB_OUT CTCCAAACCATGTGAACTTACCACTGTTGCCTCGGTGGTCGGTGCTG------GAAACAG 54 ************************************** ******* * ** *

RsaI sites: -TAC 99027M_W_NA TACCACCGCCGGTGGACTACCTAAACTCT-GTTAATTTTTGTCAA-TCTGAATCAAACTA 118 99064M_W_NA TACCACCGCCGGTGGACTACCTAAACTCT-GTTAATTTTTGTCAA-TCTGAATCAAACTA 118 99049M_W_NA TACCACCGCCGGTGGACTACCTAAACTCT-GTTAATTTTTGTCAA-TCTGAATCAAACTA 118 99061M_W_NA TACCACCGCCGGTGGACTACCTAAACTCT-GTTAATTTTTGTCAA-TCTGAATCAAACTA 118 99007N_W_NA TGCCACCGCCGGGGGACTACCTAAACTCT-GTTAATTTTTGTCAA-TCTGAATCAAACTA 118 10152N_W_EU TGCCACCGCCGGGGGACTACCTAAACTCT-GTTAATTTTTGTCAA-TCTGAATCAAACTA 118 10102N_T_EU TGCCACCGCCGGGGGACTACCTAAACTCT-GTTAATTTTTGTCAA-TCTGAATCAAACTA 118 10101N_T_EU TGCCACCGCCGGGGGACTACCTAAACTCT-GTTAATTTTTGTCAA-TCTGAATCAAACTA 118 96103N_T_NA TGCCACCGCCGGGGGACTACCTAAACTCT-GTTAATTTTTGTCAA-TCTGAATCAAACTA 118 10085N_T_NA CGCCACCGCCGGGGAACTACCTAAACTCT-GTTAATTTTTGTCAA-TCTGAATCAAACTA 118 99084N_W_NA CGCCACCGCCGGGGGACTACCTAAACTCT-GTTAATTTTTGTCAA-TCTGAATCAAACTA 118 96107N_T_NA CGCCACCGCCGGGGGACTACCTAAACTCT-GTTAATTTTTGTCAA-TCTGAATCAAACTA 118 10083N_T_EU CGCCACCGCCGGGGGACTACCTAAACTCT-GTTAATTTTTGTCAA-TCTGAATCAAACTA 118 10082N_T_EU CGCCACCGCCGGGGGACTACCTAAACTCT-GTTAATTTTTGTCAA-TCTGAATCAAACTA 118 96101N_T_NA CGCCACCGCCGGTGGACTACCTAAACTCT-GTTAATTTTTGTCAA-TCTGAATCAAACTA 118 99069N_W_NA CGCCACCGCCGGGGGACTACCTAAACTCT-GTTAATTTTTGTCAA-TCTGAATCAAACTA 118 10151M_W_EU TACCACCGCCGGGGGACTACCTAAACTCT-GTTAATTTTTGTCAA-TCTGAATCAAACTA 118 10096M_W_EU TACCACCGCCGGGGGACTACCTAAACTCT-GTTAATTTTTGTCAA-TCTGAATCAAACTA 118 10099M_W_EU TACCACCGCCGGGGGACTACCTAAACTCT-GTTAATTTTTGTCAA-TCTGAATCAAACTA 118 10149M_W_EU TACCACCGCCGGGGGACTACCTAAACTCT-GTTAATTTTTGTCAA-TCTGAATCAAACTA 118 07019_MB_OUT TGCTGCCACCGGTGGACTAC-TAAACTCTTGTTAATTTTTGTCAAATCTGAATCAAACTA 113 07020_MB_OUT TGCTGCCACCGGTGGACTAC-TAAACTCTTGTTAATTTTTGTCAAATCTGAATCAAACTA 113 * ** **** * ***** ******** *************** **************

99027M_W_NA   AGAAATAAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCA 178
99064M_W_NA   AGAAATAAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCA 178
99049M_W_NA   AGAAATAAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCA 178
99061M_W_NA   AGAAATAAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCA 178
99007N_W_NA   AGAAATAAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCA 178
10152N_W_EU   AGAAATAAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCA 178
10102N_T_EU   AGAAATAAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCA 178
10101N_T_EU   AGAAATAAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCA 178
96103N_T_NA   AGAAATAAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCA 178
10085N_T_NA   AGAAATAAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCA 178
99084N_W_NA   AGAAATAAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCA 178
96107N_T_NA   AGAAATAAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCA 178
10083N_T_EU   AGAAATAAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCA 178
10082N_T_EU   AGAAATAAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCA 178
96101N_T_NA   AGAAATAAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCA 178
99069N_W_NA   AGAAATAAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCA 178
10151M_W_EU   AGAAATAAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCA 178
10096M_W_EU   AGAAATAAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCA 178
10099M_W_EU   AGAAATAAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCA 178
10149M_W_EU   AGAAATAAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCA 178
07019_MB_OUT  AGAAATAAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCA 173
07020_MB_OUT  AGAAATAAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCA 173
              ************************************************************

99027M_W_NA GCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGC 238 99064M_W_NA GCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGC 238 99049M_W_NA GCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGC 238 99061M_W_NA GCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGC 238 99007N_W_NA GCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGC 238 10152N_W_EU GCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGC 238 10102N_T_EU GCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGC 238 10101N_T_EU GCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGC 238 96103N_T_NA GCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGC 238 10085N_T_NA GCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGC 238 99084N_W_NA GCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGC 238 96107N_T_NA GCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGC 238 10083N_T_EU GCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGC 238 10082N_T_EU GCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGC 238 96101N_T_NA GCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGC 238 99069N_W_NA GCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGC 238 10151M_W_EU GCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGC 238 10096M_W_EU GCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGC 238 10099M_W_EU GCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGC 238 10149M_W_EU GCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGC 238 07019_MB_OUT GCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGC 233 07020_MB_OUT GCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGC 233 ************************************************************

99027M_W_NA ACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCAACCCTTAA 298 99064M_W_NA ACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCAACCCTTAA 298 99049M_W_NA ACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCAACCCTTAA 298 99061M_W_NA ACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCAACCCTTAA 298 99007N_W_NA ACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCAACCCTTAA 298 10152N_W_EU ACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCAACCCTTAA 298 10102N_T_EU ACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCAACCCTTAA 298 10101N_T_EU ACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCAACCCTTAA 298 96103N_T_NA ACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCAACCCTTAA 298 10085N_T_NA ACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCAACCCTTAA 298 99084N_W_NA ACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCAACCCTTAA 298 96107N_T_NA ACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCAACCCTTAA 298 10083N_T_EU ACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCAACCCTTAA 298 10082N_T_EU ACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCAACCCTTAA 298 96101N_T_NA ACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCAACCCTTAA 298 99069N_W_NA ACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCAACCCTTAA 298 10151M_W_EU ACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCAACCCTTAA 298 10096M_W_EU ACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCAACCCTTAA 298 10099M_W_EU ACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCAACCCTTAA 298 10149M_W_EU ACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCAACCCTTAA 298 07019_MB_OUT ACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCAACCCTTAA 293 07020_MB_OUT ACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCAACCCTTAA 293 ************************************************************

99027M_W_NA   GCCTAGCTTAGTGTTGGGAGACTGCCTAATACGCAGCTCCTCAAAACCAGTGGCGGAGTC 358
99064M_W_NA   GCCTAGCTTAGTGTTGGGAGACTGCCTAATACGCAGCTCCTCAAAACCAGTGGCGGAGTC 358
99049M_W_NA   GCCTAGCTTAGTGTTGGGAGACTGCCTAATACGCAGCTCCTCAAAACCAGTGGCGGAGTC 358
99061M_W_NA   GCCTAGCTTAGTGTTGGGAGACTGCCTAATACGCAGCTCCTCAAAACCAGTGGCGGAGTC 358
99007N_W_NA   GCCTAGCTTAGTGTTGGGAGACTGCCTAATACGCAGCTCCTCAAAACCAGTGGCGGAGTC 358
10152N_W_EU   GCCTAGCTTAGTGTTGGGAGACTGCCTAATACGCAGCTCCTCAAAACCAGTGGCGGAGTC 358
10102N_T_EU   GCCTAGCTTAGTGTTGGGAGACTGCCTAATACGCAGCTCCTCAAAACCAGTGGCGGAGTC 358
10101N_T_EU   GCCTAGCTTAGTGTTGGGAGACTGCCTAATACGCAGCTCCTCAAAACCAGTGGCGGAGTC 358
96103N_T_NA   GCCTAGCTTAGTGTTGGGAGACTGCCTAATACGCAGCTCCTCAAAACCAGTGGCGGAGTC 358
10085N_T_NA   GCCTAGCTTAGTGTTGGGAGACTGCCTAATACGCAGCTCCTCAAAACCAGTGGCGGAGTC 358
99084N_W_NA   GCCTAGCTTAGTGTTGGGAGACTGCCTAATACGCAGCTCCTCAAAACCAGTGGCGGAGTC 358
96107N_T_NA   GCCTAGCTTAGTGTTGGGAGACTGCCTAATACGCAGCTCCTCAAAACCAGTGGCGGAGTC 358
10083N_T_EU   GCCTAGCTTAGTGTTGGGAGACTGCCTAATACGCAGCTCCTCAAAACCAGTGGCGGAGTC 358
10082N_T_EU   GCCTAGCTTAGTGTTGGGAGACTGCCTAATACGCAGCTCCTCAAAACCAGTGGCGGAGTC 358
96101N_T_NA   GCCTAGCTTAGTGTTGGGAGACTGCCTAATACGCAGCTCCTCAAAACCAGTGGCGGAGTC 358
99069N_W_NA   GCCTAGCTTAGTGTTGGGAGACTGCCTAATACGCAGCTCCTCAAAACCAGTGGCGGAGTC 358
10151M_W_EU   GCCTAGCTTAGTGTTGGGAGACTGCCTAATACGCAGCTCCTCAAAACCAGTGGCGGAGTC 358
10096M_W_EU   GCCTAGCTTAGTGTTGGGAGACTGCCTAATACGCAGCTCCTCAAAACCAGTGGCGGAGTC 358
10099M_W_EU   GCCTAGCTTAGTGTTGGGAGACTGCCTAATACGCAGCTCCTCAAAACCAGTGGCGGAGTC 358
10149M_W_EU   GCCTAGCTTAGTGTTGGGAGACTGCCTAATACGCAGCTCCTCAAAACCAGTGGCGGAGTC 358
07019_MB_OUT  GCCTAGCTTAGTGTTGGGAGACTGCCTAATACGCAGCTCCTCAAAACCAGTGGCGGAGTC 353
07020_MB_OUT  GCCTAGCTTAGTGTTGGGAGACTGCCTAATACGCAGCTCCTCAAAACCAGTGGCGGAGTC 353
              ************************************************************

99027M_W_NA GGTTCGTGCTCTGAGCGTAGTAATTTTTTATCTCGCTTCTGCAAGCCGGACTGGCAACAG 418 99064M_W_NA GGTTCGTGCTCTGAGCGTAGTAATTTTTTATCTCGCTTCTGCAAGCCGGACTGGCAACAG 418 99049M_W_NA GGTTCGTGCTCTGAGCGTAGTAATTTTTTATCTCGCTTCTGCAAGCCGGACTGGCAACAG 418 99061M_W_NA GGTTCGTGCTCTGAGCGTAGTAATTTTTTATCTCGCTTCTGCAAGCCGGACTGGCAACAG 418 99007N_W_NA GGTTCGTGCTCTGAGCGTAGTAATTTTTTATCTCGCTTCTGCAAGCCGGACTGGCAACAG 418 10152N_W_EU GGTTCGTGCTCTGAGCGTAGTAATTTTTTATCTCGCTTCTGCAAGCCGGACTGGCAACAG 418 10102N_T_EU GGTTCGTGCTCTGAGCGTAGTAATTTTTTATCTCGCTTCTGCAAGCCGGACTGGCAACAG 418 10101N_T_EU GGTTCGTGCTCTGAGCGTAGTAATTTTTTATCTCGCTTCTGCAAGCCGGACTGGCAACAG 418 96103N_T_NA GGTTCGTGCTCTGAGCGTAGTAATTTTTTATCTCGCTTCTGCAAGCCGGACTGGCAACAG 418 10085N_T_NA GGTTCGTGCTCTGAGCGTAGTAATTTTTTATCTCGCTTCTGCAAGCCGGACTGGCAACAG 418 99084N_W_NA GGTTCGTGCTCTGAGCGTAGTAATTTTTTATCTCGCTTCTGTAAGCCGGACTGGCAACAG 418 96107N_T_NA GGTTCGTGCTCTGAGCGTAGTAATTTTTTATCTCGCTTCTGCAAGCCGGACTGGCAACAG 418 10083N_T_EU GGTTCGTGCTCTGAGCGTAGTAATTTTTTATCTCGCTTCTGCAAGCCGGACTGGCAACAG 418 10082N_T_EU GGTTCGTGCTCTGAGCGTAGTAATTTTTTATCTCGCTTCTGCAAGCCGGACTGGCAACAG 418 96101N_T_NA GGTTCGTGCTCTGAGCGTAGTAATTTTTTATCTCGCTTCTGCAAGCCGGACTGGCAACAG 418 99069N_W_NA GGTTCGTGCTCTGAGCGTAGTAATTTTTTATCTCGCTTCTGCAAGCCGGACTGGCAACAG 418 10151M_W_EU GGTTCGTGCTCTGAGCGTAGTAATTTTTTATCTCGCTTCTGCAAGCCGGACTGGCAACAG 418 10096M_W_EU GGTTCGTGCTCTGAGCGTAGTAATTTTTTATCTCGCTTCTGCAAGCCGGACTGGCAACAG 418 10099M_W_EU GGTTCGTGCTCTGAGCGTAGTAATTTTTTATCTCGCTTCTGCAAGCCGGACTGGCAACAG 418 10149M_W_EU GGTTCGTGCTCTGAGCGTAGTAATTTTTTATCTCGCTTCTGCAAGCCGGACTGGCAACAG 418 07019_MB_OUT TGTTCGTGCTCTGAGCGTAGTAATTTTTTATCTCGCTTCTGCAAGCCGGCCAGACGACAG 413 07020_MB_OUT TGTTCGTGCTCTGAGCGTAGTAATTTTTTATCTCGCTTCTGCAAGCCGGCCAGACGACAG 413 **************************************** ******* * * * ****

99027M_W_NA CCATAAACCGCACCCTTC--GGGGGCACTTTTT-AATGGTTGACCTCGGATCAGGTAGG- 474 99064M_W_NA CCATAAACCGCACCCTTC--GGGGGCACTTTTT-AATGGTTGACCTCGGATCAGGTAGG- 474 99049M_W_NA CCATAAACCGCACCCTTC--GGGGGCACTTTTT-AATGGTTGACCTCGGATCAGGTAGG- 474 99061M_W_NA CCATAAACCGCACCCTTC--GGGGGCACTTTTT-AATGGTTGACCTCGGATCAGGTAGG- 474 99007N_W_NA CCATAAACCGCACCCTTC--GGGGGCACTTTTT-AATGGTTGACCTCGGATCAGGTAGG- 474 10152N_W_EU CCATAAACCGCACCCTTC--GGGGGCACTTTTT-AATGGTTGACCTCGGATCAGGTAGG- 474 10102N_T_EU CCATAAACCGCACCCTTC--GGGGGCACTTTTT-AATGGTTGACCTCGGATCAGGTAGG- 474 10101N_T_EU CCATAAACCGCACCCTTC--GGGGGCACTTTTT-AATGGTTGACCTCGGATCAGGTAGG- 474 96103N_T_NA CCATAAACCGCACCCTTC--GGGGGCACTTTTT-AATGGTTGACCTCGGATCAGGTAGG- 474 10085N_T_NA CCAAAAACCGCACCCCTC--GGGGGCACTTTTT-AATGGTTGACCTCGGATCAGGTAGG- 474 99084N_W_NA CCAAAAACCGCACCCCTC--GGGGGCACTTTTT-AATGGTTGACCTCGGATCAGGTAGG- 474 96107N_T_NA CCAAAAACCGCACCCCTC--GGGGGCACTTTTT-AATGGTTGACCTCGGATCAGGTAGG- 474 10083N_T_EU CCAAAAACCGCACCCCTC--GGGGGCACTTTTT-AATGGTTGACCTCGGATCAGGTAGG- 474 10082N_T_EU CCAAAAACCGCACCCCTC--GGGGGCACTTTTT-AATGGTTGACCTCGGATCAGGTAGG- 474 96101N_T_NA CCAAAAACCGCACCCCTC--GGGGGCACTTTTT-AATGGTTGACCTCGGATCAGGTAGG- 474 99069N_W_NA CCATAAACCGCACCCTTC--GGGGGCACTTTTT-AATGGTTGACCTCGGATCAGGTAGG- 474 10151M_W_EU CCATAAACCGCACCCTTC--GGGGGCACTTTTT-AATGGTTGACCTCGGATCAGGTAGG- 474 10096M_W_EU CCATAAACCGCACCCTTC--GGGGGCACTTTTT-AATGGTTGACCTCGGATCAGGTAGG- 474 10099M_W_EU CCATAAACCGCACCCTTC--GGGGGCACTTTTT-AATGGTTGACCTCGGATCAGGTAGG- 474 10149M_W_EU CCATAAACCGCACCCTTC--GGGGGCACTTTTT-AATGGTTGACCTCGGATCAGGTAGG- 474 07019_MB_OUT CCATAAACCGCACCCTCTCGGGGGGCACTTTTTTAATGGTTGACCTCGGATCAGGTAGG- 472 07020_MB_OUT CCATAAACCGCACCCTCTCGGGGGGCACTTTTTTAATGGTTGACCTCGGATCAGGTAGGG 473 *** *********** ************* *************************

99027M_W_NA   AATACCCGCTGAACTTAAGCATA 497
99064M_W_NA   AATACCCGCTGAACTTAAGCATA 497
99049M_W_NA   AATACCCGCTGAACTTAAGCATA 497
99061M_W_NA   AATACCCGCTGAACTTAAGCATA 497
99007N_W_NA   AATACCCGCTGAACTTAAGCATA 497
10152N_W_EU   AATACCCGCTGAACTTAAGCATA 497
10102N_T_EU   AATACCCGCTGAACTTAAGCATA 497
10101N_T_EU   AATACCCGCTGAACTTAAGCATA 497
96103N_T_NA   AATACCCGCTGAACTTAAGCATA 497
10085N_T_NA   AATACCCGCTGAACTTAAGCATA 497
99084N_W_NA   AATACCCGCTGAACTTAAGCATA 497
96107N_T_NA   AATACCCGCTGAACTTAAGCATA 497
10083N_T_EU   AATACCCGCTGAACTTAAGCATA 497
10082N_T_EU   AATACCCGCTGAACTTAAGCATA 497
96101N_T_NA   AATACCCGCTGAACTTAAGCATA 497
99069N_W_NA   AATACCCGCTGAACTTAAGCATA 497
10151M_W_EU   AATACCCGCTGAACTTAAGCATA 497
10096M_W_EU   AATACCCGCTGAACTTAAGCATA 497
10099M_W_EU   AATACCCGCTGAACTTAAGCATA 497
10149M_W_EU   AATACCCGCTGAACTTAAGCATA 497
07019_MB_OUT  AATACCCGCTGAACTTAAGCATA 495
07020_MB_OUT  AATACCCGCTGAACTTAAGCATA 496
              ***********************
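The RsaI annotation in this alignment can be checked with a short script. RsaI recognizes the sequence GTAC, and the annotation row places the site across alignment positions 60 to 63. The following is a minimal sketch only: the two fragments are copied from alignment positions 54 to 68 of isolates 99027M_W_NA and 99007N_W_NA, and the function name `find_sites` is hypothetical rather than part of any analysis performed in this thesis.

```python
def find_sites(sequence: str, site: str = "GTAC") -> list[int]:
    """Return the 0-based start index of every occurrence of `site`."""
    hits = []
    start = sequence.find(site)
    while start != -1:
        hits.append(start)
        start = sequence.find(site, start + 1)
    return hits


# Alignment positions 54-68, copied from Appendix 2.6 (illustrative fragments only).
FRAGMENT_START = 54
majus_fragment = "GGGACGGTACCACCG"   # 99027M_W_NA
nivale_fragment = "GGGACGGTGCCACCG"  # 99007N_W_NA

for name, fragment in [("99027M_W_NA", majus_fragment), ("99007N_W_NA", nivale_fragment)]:
    # Convert fragment indices back to 1-based alignment positions.
    positions = [FRAGMENT_START + i for i in find_sites(fragment)]
    print(name, positions)
```

In these two fragments the recognition sequence is found only in the M. majus isolate, starting at alignment position 60.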

Chapter 3 Comparative Genomics

3.1 Introduction

3.1.1 General overview of whole-genome analyses

As the availability of DNA and RNA sequencing data has increased rapidly over the past 30 years (GenBank 2012), researchers have developed new tools to search, compare, analyse, and manipulate increasingly complex data sets. Bioinformatics is a field that uses the tools of computer science to address these challenges in biology (Attwood et al. 2011). One use of bioinformatics is the assembly and analysis of whole-genome sequencing data (Pop and Salzberg 2008).

The first eukaryotic genomes became available in the late 20th century (Goffeau et al. 1996), and were produced with Sanger sequencing (Sanger et al. 1977). However, sequencing technology has advanced rapidly, and a number of different sequencing strategies have become commercially available within the last decade (Shendure and Ji 2008). These techniques have reduced both the cost and the time required to produce large amounts of sequencing data (Lister et al. 2009). Genome sequences of a number of organisms have been produced using these technologies in recent years, particularly for species where reference genomes are available (e.g. Hillier et al. 2008; Ossowski et al. 2008).

Fungi present an interesting and tractable challenge for de novo whole-genome sequencing using next-generation methods (Haridas et al. 2011). The de novo sequencing of fungal genomes has already been accomplished using next-generation sequencing alone (Nowrousian et al. 2010). Filamentous ascomycetes are haploid in their vegetative state (Alexopoulos et al. 1996), and the conidia and ascospores of most Ascomycota are uninucleate (Alexopoulos et al. 1996). These factors facilitate the collection of genetically homogeneous material and the sequencing of single ascomycetous isolates. Furthermore, the genomes of most filamentous ascomycetes are smaller than 37 Mb (Gregory et al. 2007), which both facilitates the collection of sufficient data for successful assembly and renders the assembly of these sequence fragments computationally feasible.
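To give a sense of scale for a genome of this size, the expected sequencing depth can be approximated as c = N × L / G, where N is the number of reads, L the read length, and G the genome size. The sketch below applies this relationship to a 37 Mb genome. The 100 bp read length and 50-fold target depth are illustrative assumptions, not values taken from this study, and the function names are hypothetical.

```python
import math


def reads_needed(genome_size_bp: int, read_length_bp: int, target_depth: int) -> int:
    """Smallest number of reads N such that N * L / G >= target_depth."""
    return math.ceil(target_depth * genome_size_bp / read_length_bp)


def expected_depth(n_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Average number of times each base is expected to be sequenced."""
    return n_reads * read_length_bp / genome_size_bp


GENOME_SIZE_BP = 37_000_000  # upper bound quoted above for most filamentous ascomycetes
READ_LENGTH_BP = 100         # assumed read length (illustrative)
TARGET_DEPTH = 50            # assumed target depth (illustrative)

n_reads = reads_needed(GENOME_SIZE_BP, READ_LENGTH_BP, TARGET_DEPTH)
print(n_reads)  # 18500000
print(expected_depth(n_reads, READ_LENGTH_BP, GENOME_SIZE_BP))  # 50.0
```

Under these assumptions, fewer than 20 million reads would be required. The calculation ignores uneven coverage and sequencing errors.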

3.1.2 Sequencing platforms

Until recently, most sequencing was performed using the Sanger method, which, when first developed, amplified a sequence of interest in the presence of both normal deoxynucleotides (dNTPs) and radioactively labelled chain-terminating dideoxynucleotides (ddNTPs). This process produced a pool of oligonucleotides of various lengths that were separated on polyacrylamide gels and visualized using x-ray film (Sanger et al. 1977). Sanger sequencing was later updated through the use of fluorescent labels, eliminating the need for radioactivity and facilitating the automation of detection (e.g. Ansorge et al. 1986). Sanger sequencing produces reads that are approximately 1 kb in length (Shendure and Ji 2008).
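The logic of chain termination can be sketched in a few lines of code. Each of the four ddNTPs terminates synthesis wherever its base occurs, so the lengths of the resulting fragments record the positions of that base, and reading the fragments from shortest to longest recovers the sequence. This is a simplified, idealized model (one fragment per position, no missing bands), and the function names are hypothetical.

```python
def chain_termination_fragments(synthesized: str) -> dict[str, list[int]]:
    """For each ddNTP, list the lengths of the fragments that end in that base."""
    lanes: dict[str, list[int]] = {base: [] for base in "ACGT"}
    for length, base in enumerate(synthesized, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes: dict[str, list[int]]) -> str:
    """Read the sequence by ordering all fragments from shortest to longest."""
    base_by_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(base_by_length[length] for length in sorted(base_by_length))


lanes = chain_termination_fragments("CATCGAC")
print(lanes["C"])       # [1, 4, 7]
print(read_gel(lanes))  # CATCGAC
```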

Although Sanger sequencing is still common, especially for the sequencing of short DNA molecules (such as plasmids and PCR products), newer sequencing technologies have become widely used in large-scale sequencing projects (e.g. Li et al. 2009b). Like Sanger sequencing, Illumina-Solexa sequencing is a sequencing-by-synthesis method, but it has the additional benefit of producing results in real time (Metzker 2010). The process of Illumina-Solexa sequencing (also abbreviated as SBS, for sequencing-by-synthesis) is illustrated in Figure 3.1. In SBS, as in other next-generation sequencing methods, genomic DNA is first sheared into random fragments (Nielsen et al. 2011). Both ends of the fragments are ligated to adapter segments of DNA, which in turn are complementary to short nucleotide sequences that are attached to a solid surface, effectively binding the strand to be sequenced to the solid surface (Metzker 2010). Prior to sequencing, the strands are amplified by PCR: because the adapter sequences are complementary to the sequences bound to the solid surface, the hybridized adapter segments form short double-stranded regions from which DNA polymerase can extend (Figure 3.1). When the amplification is complete, the result is "clusters" of identical sequence fragments. To sequence the fragments, the four fluorescently labelled dNTPs are introduced into the reaction mixture. Because the dNTPs have been chemically modified to block the 3' hydroxyl group, only a single nucleotide can be incorporated at a time (Mardis 2008). When the reaction mixture is washed away, the identity of the nucleotide that has been added is read, the blocking group is cleaved away, and the cycle is repeated (Mardis 2008). The length of the reads produced by SBS has increased sharply, from 36 bp in 2008 (Shendure and Ji 2008) to 150 bp in 2012 (Quail et al. 2012).
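The cyclic nature of SBS, in which exactly one blocked nucleotide is incorporated and imaged per cycle, can be illustrated with a minimal sketch. Strand orientation, cluster amplification, and base-calling errors are deliberately ignored here, and the function name `sbs_read` is hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sbs_read(template: str, n_cycles: int) -> str:
    """Simulate one read: one complementary base is added and imaged per cycle."""
    read = []
    for cycle in range(min(n_cycles, len(template))):
        # The 3' block allows only a single nucleotide to be incorporated.
        incorporated = COMPLEMENT[template[cycle]]
        # The fluorescent label identifies the base that was added.
        read.append(incorporated)
        # The block and label are then cleaved, allowing the next cycle.
    return "".join(read)


print(sbs_read("GTAC", n_cycles=36))  # CATG
print(sbs_read("AAAA", n_cycles=2))   # TT
```

Because each cycle adds one base, the read length in this model equals the number of cycles, and a run of identical bases is read one base at a time.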

Several other next-generation sequencing platforms also exist. Pyrosequencing, a technique employed by 454 sequencing, uses the pyrophosphate molecule released during the incorporation of a nucleotide onto a growing strand of DNA to generate a "flash" of light using the enzyme luciferase (Mardis 2008). The order of the sequence in question is determined by introducing a single dNTP species at a time into the reaction mixture, and simply recording those which induce a flash of light (Metzker 2010). In 2005, the read length produced by this platform was already 100-150 bp, and as of 2012 it was 700 bp (Liu et al. 2012).
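The way in which a series of light signals is translated into a sequence can be illustrated with a short Python sketch. This is a simplified illustration only, not the base-calling software of any platform: the function name, the flow order, and the signal values are hypothetical, and the sketch assumes that the light signal is proportional to the number of identical nucleotides incorporated during a single flow.

```python
def decode_flowgram(flow_order, signals):
    """Translate per-flow light signals into a base sequence.

    Each signal is assumed to be proportional to the number of identical
    nucleotides incorporated during that flow; it is rounded to the
    nearest integer to estimate the length of the homopolymer.
    """
    bases = []
    for nucleotide, signal in zip(flow_order, signals):
        count = int(signal + 0.5)  # round half up; signals are non-negative
        bases.append(nucleotide * count)
    return "".join(bases)


# Hypothetical example: dNTPs are flowed in the repeating order T, A, C, G.
example = decode_flowgram("TACGTACG", [1.02, 0.05, 0.0, 2.1, 0.9, 0.1, 1.1, 0.0])
```

Because the length of a run of identical bases must be inferred from the intensity of a single flash, small errors in the measured signal translate directly into errors in the estimated length of the run.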

Ion Torrent sequencing (also known as ion semiconductor sequencing) shares some similarities with 454 sequencing. Whereas 454 sequencing is dependent upon the detection of pyrophosphate when a nucleotide is introduced into a growing chain, Ion Torrent sequencing detects the release of a proton (i.e. a hydrogen ion). When a proton is released, the pH of the reaction mixture decreases slightly. This change in acidity is thus detected only when the correct dNTP solution is introduced onto the reaction surface. As of 2012, the read length of Ion Torrent sequencing was approximately 200 bp (Quail et al. 2012).

In sequencing by ligation (SOLiD sequencing), as in SBS, the DNA is first cleaved into small fragments and ligated to adapter sequences. A primer sequence that is complementary to the adapter is introduced. Next, fluorescently-labelled priming sequences are introduced into the reaction mixture. These eight-base-long (octameric) priming sequences consist of two nucleotides, followed by a set of six nucleotides ligated to a fluorophore. If the first two nucleotides are complementary to the strand to be sequenced, they will bind, and the fluorophore will be detected (Mardis 2008). The octamer is cleaved after the fifth base, and the next octamer is introduced. In this way, the nucleotides at the 1st and 2nd, the 6th and 7th, the 11th and 12th positions, and so on, will be determined. The procedure is then repeated, but rather than using a priming sequence that is the full length of the adapter sequence, a primer that is one base shorter is used. In this way, the first octamer will reveal the 0th (adapter) and the 1st bases, and subsequent rounds will determine the identities of the 5th and 6th and the 10th and 11th positions, and so on.

This entire process is repeated until the full sequence of the fragment has been determined (Mardis 2008), and the maximum read length in 2012 was 50 bp (Quail et al.).
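The pattern of positions queried in each round of ligation can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than a description of the instrument software: the function names are hypothetical, position 0 denotes the last base of the adapter, and five primer rounds (with the primer shortened by zero to four bases) are assumed.

```python
def interrogated_pairs(read_length, primer_offset, step=5):
    """Return the (first, second) template positions queried in one primer round.

    Position 1 is the first base of the fragment and position 0 is the last
    base of the adapter. primer_offset is the number of bases by which the
    primer has been shortened (0 for the full-length primer).
    """
    pairs = []
    first = 1 - primer_offset
    while first <= read_length:
        pairs.append((first, first + 1))
        first += step  # each ligation cycle advances by five bases
    return pairs


def coverage(read_length, rounds=5):
    """Count how many times each fragment position is queried over all rounds."""
    counts = {position: 0 for position in range(1, read_length + 1)}
    for offset in range(rounds):
        for pair in interrogated_pairs(read_length, offset):
            for position in pair:
                if position in counts:
                    counts[position] += 1
    return counts
```

Under these assumptions, calling coverage(50) reports a count of two for every position in a 50 bp read, which corresponds to each nucleotide being queried twice.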

Although all four of these next-generation methods are currently in use (in addition to several other methods not discussed here), each one offers distinct advantages and disadvantages in terms of the overall cost, speed, and accuracy of the sequencing results. Due to the short read lengths of all next-generation sequencing technologies relative to Sanger sequencing, one disadvantage shared by all of these methods is the difficulty in re-assembling repetitive regions (Alkan et al. 2011). Because each nucleotide is queried twice, SOLiD sequencing has a high level of accuracy relative to other next-generation methods (Metzker 2010). However, this same benefit also increases both the duration and the cost of a sequencing run performed using SOLiD sequencing (Metzker 2010). Pyrosequencing offers long read lengths relative to the other next-generation methods (up to 1 kb as of 2013; Roche Diagnostics Corporation 2013), but the necessary reagents are expensive and the method has a higher error rate for single-nucleotide repeats ("homopolymers"; Metzker 2010). Ion Torrent sequencing generates data extremely quickly, but otherwise shares the same problems as pyrosequencing and has a read length comparable to other next-generation methods (Quail et al. 2012). By comparison, a single run of SBS is slower, but less expensive than many of the other next-generation sequencing options (Metzker 2010; Quail et al. 2012). Sequencing by synthesis generates a large number of reads in each run, permitting deep coverage for genomic sequencing, which facilitates later assembly (Quail et al. 2012). For these reasons, SBS has been successfully used in the de novo sequencing of organisms such as Ailuropoda melanoleuca (the giant panda) (Li et al. 2009b) and Homo sapiens (Simpson et al. 2009).

3.1.3 Genome assembly and protein prediction

The short length of the reads produced by next-generation sequencing techniques creates a challenge for genome assembly similar to the way that a puzzle with more pieces is more difficult to assemble than one with fewer pieces. The use of paired-end sequencing attempts to mitigate this issue. Paired-end sequencing involves sequencing a fragment of known length from both ends (Fullwood et al. 2009). Because the approximate full length of the DNA fragment is known, as well as the length of the two sequenced regions, these sequences can be re-assembled more easily because the distance between them is known (Figure 3.2).


In addition to the advantage offered by paired-end sequencing, most freely-available assembly programs, including SOAPdenovo (Li et al. 2009c; Li et al. 2008), Velvet (Zerbino and Birney 2008), and ABySS (Simpson et al. 2009), use de Bruijn graphs to associate short groups of aligned reads (Figure 3.3). Before preparing a de Bruijn graph, the raw, short reads are aligned (Zerbino and Birney 2008). The alignments comprise the nodes of the de Bruijn graph, and the nodes are then associated with one another based on their similarities (i.e. potential for overlap) (Zerbino and Birney 2008). In this way, small "pockets" of similarity are associated, and a longer sequence is thus assembled based on the alignments of the short reads (Zerbino and Birney 2008).
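The general idea of assembling a longer sequence from overlapping short words can be sketched in Python. The sketch below is a deliberately simplified illustration and is not the implementation used by Velvet, SOAPdenovo, or ABySS: in the commonly described formulation assumed here, each read is decomposed into overlapping words of length k (k-mers), each k-mer forms an edge between its first and last k-1 bases, and a contig is extended for as long as the path through the graph is unambiguous. The function names and the example reads are hypothetical.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a de Bruijn graph in which nodes are (k-1)-mers and edges are k-mers."""
    graph = defaultdict(list)
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer in seen:
                continue  # each distinct k-mer contributes a single edge
            seen.add(kmer)
            graph[kmer[:-1]].append(kmer[1:])
    return graph


def walk_unambiguous_path(graph, start):
    """Extend a sequence from `start` while each node has exactly one successor."""
    sequence = start
    node = start
    visited = {start}
    while len(graph.get(node, [])) == 1:
        node = graph[node][0]
        if node in visited:  # stop on a cycle, such as a repeat
            break
        visited.add(node)
        sequence += node[-1]
    return sequence


# Three hypothetical overlapping reads drawn from one short sequence.
reads = ["ATGGCGT", "GGCGTGC", "GTGCAAT"]
contig = walk_unambiguous_path(build_de_bruijn(reads, k=4), "ATG")
```

The walk stops wherever a node has more than one successor, which is one way of seeing why repetitive regions are difficult to assemble from short reads.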

Although the goals of any individual study may vary, the prediction of protein-coding sequences is often an important prerequisite to whole-genome comparison. The identification of protein-coding sequences may facilitate a variety of comparisons, including the identification of novel genes that may be related to the lifestyle of an organism of interest (O’Connell et al. 2012).

The identification of gene families that contain duplications or deletions may also help to clarify evolutionary relationships (Chang and Duda 2012). In addition, the examination of an organism's protein-coding sequences may reveal the genetic mechanisms responsible for an organism's unique abilities, such as cool-temperature survival (Methé et al. 2005) or pathogenicity (O’Connell et al. 2012).

Although the basic concept of identifying protein-coding regions within the genome may appear simple on the surface, the problem of identifying only the functional protein-coding sequences from among the millions of base pairs in a sequenced genome is not trivial. For example, the frequency (and even the presence) of introns may vary between taxa (Stajich et al. 2006), and the presence of pseudogenes (truncated or otherwise frame-shifted former protein-coding sequences) (Nelson and Cox 2004) may confound algorithms designed to detect protein-coding sequences.

Many of the gene-finding programs developed to date are based on hidden Markov models (e.g. Borodovsky and Lukashin 1998; Stanke et al. 2004). A Markov chain is a statistical model that describes the probability of a system assuming any particular state in the future, given its current state; the probability that the system will assume any of the possible state identities in the future is dependent only on the current state of the system (Stamp 2012). In a hidden Markov model (HMM), a Markov process is occurring wherein the identity of the state is hidden, but the outcome resulting from this state is observable (Stamp 2012). For example, in the context of gene discovery, the relative frequencies of the four nucleotides in coding regions of DNA may differ from those in non-coding regions, and the identity of these hidden states (i.e. as coding vs. non-coding) for a given segment of DNA can be elucidated through the observed pattern of nucleotides and the known probabilities of transitioning between each nucleotide within either state and between the two states themselves (Eddy 2004). A more accurate model to predict the state of a given sequence as coding or non-coding would incorporate more than one observation (i.e. nucleotide); this is the basis of a generalized hidden Markov model (GHMM), wherein a single (hidden) state produces multiple observations (Stanke 2003). The gene-finding program AUGUSTUS is based on this model and uses a string of nucleotides to estimate the current state of the string as one of a multitude of possible states (e.g. an intron, an intron-exon boundary, promoter region, etc.) (Stanke 2003). Outside of pure de novo prediction methods, gene predictions may also be based on the sequencing information available for related species or, in some cases, previously-sequenced members of the same species (Sleator 2010); AUGUSTUS also uses the coding sequences of related organisms to guide its predictions.
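The coding versus non-coding example can be made concrete with a minimal Python sketch of a two-state HMM decoded with the Viterbi algorithm. This is a toy first-order model and not the generalized model implemented in AUGUSTUS; all start, transition, and emission probabilities below are invented values chosen only for illustration.

```python
import math

STATES = ("coding", "noncoding")

# All probabilities below are made-up values used only for illustration.
START = {"coding": 0.5, "noncoding": 0.5}
TRANSITION = {
    "coding": {"coding": 0.9, "noncoding": 0.1},
    "noncoding": {"coding": 0.1, "noncoding": 0.9},
}
EMISSION = {
    "coding": {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15},
    "noncoding": {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35},
}


def viterbi(sequence):
    """Return the most probable hidden-state path for a nucleotide sequence."""
    if not sequence:
        return []
    # Work in log space to avoid numerical underflow on long sequences.
    scores = [
        {s: math.log(START[s]) + math.log(EMISSION[s][sequence[0]]) for s in STATES}
    ]
    backpointers = []
    for base in sequence[1:]:
        column, pointers = {}, {}
        for state in STATES:
            best_prev = max(
                STATES,
                key=lambda p: scores[-1][p] + math.log(TRANSITION[p][state]),
            )
            column[state] = (
                scores[-1][best_prev]
                + math.log(TRANSITION[best_prev][state])
                + math.log(EMISSION[state][base])
            )
            pointers[state] = best_prev
        scores.append(column)
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=lambda s: scores[-1][s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

With these invented parameters, a GC-rich stretch is labelled as coding and an AT-rich stretch as non-coding; the hidden state is inferred only from the observed nucleotides and the assumed probabilities.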

3.1.4 Whole-genome comparisons

As genome sequencing technology has become more widely accessible, the genomes of several hundred fungal species (Choi et al. 2013) have been published. These species include pathogens of plants and animals, and some researchers have sequenced two or more representatives of related groups to search for trends in the genomic origin of pathogenicity and in the overall arrangement of these fungal genomes (e.g. Amselem et al. 2011; Gao et al. 2011; Jackson et al. 2009; Schirawski et al. 2010; Sharpton et al. 2009). Among several sequenced pairs of species, including the insect pathogens Metarhizium anisopliae and M. acridum (Gao et al. 2011), the human pathogens Candida albicans and C. dubliniensis (Jackson et al. 2009), the plant pathogens Colletotrichum graminicola and C. higginsianum (O’Connell et al. 2012), Sclerotinia sclerotiorum and Botrytis cinerea (Amselem et al. 2011), and the maize pathogens Ustilago maydis and Sporisorium reilianum (Schirawski et al. 2010), high degrees of synteny were reported within each pair. Similarly, when three species of Aspergillus were sequenced, synteny in both coding and non-coding regions of the genome was reported between all species (Galagan et al. 2005).

In each of their comparisons between the genomes of pathogenic fungal species, the researchers investigated pathogenic differences and the genetic basis of host specificity among the species studied by identifying predicted genes with putative pathogenic functions. For example, although both are entomopathogens, M. acridum is a locust-specific pathogen whereas M. anisopliae attacks a variety of insect species. Despite this difference, the genomes shared nearly 90% amino acid homology among their predicted genes (Gao et al. 2011). Both also possessed several predicted proteins that were homologous to reported plant pathogenesis-related proteins (such as hydrophobins), in addition to proteases that are predicted to play a role in the degradation of the cuticle of insects. There were no apparent differences in the relative numbers or types of degradative enzymes present in the genomes that could readily explain the observed host preferences. In contrast, the maize pathogens Ustilago maydis and Sporisorium reilianum possessed dissimilar effector proteins, which may reflect their differences in tissue type preference and infection symptoms (Schirawski et al. 2010). The hemibiotrophic pathogens Colletotrichum higginsianum and C. graminicola also possessed differences in their predicted proteins related to their functions as pathogens (O’Connell et al. 2012). The genome of C. higginsianum contained more than twice as many pectinases as the genome of C. graminicola; in contrast, the genome of C. graminicola encoded a more diverse family of cellulases than those found in C. higginsianum. These differences correlate with each species' preferred host: whereas C. higginsianum is primarily a pathogen of dicots, which possess cell walls that are richer in pectin, C. graminicola is a pathogen of grasses and cereals (monocots), which have cell walls that are rich in cellulose.

In the necrotrophic plant pathogens Sclerotinia sclerotiorum and Botrytis cinerea (Amselem et al. 2011), both genomes shared homologs of pathogenesis-related genes, such as genes involved in programmed cell death in plants. Similarly, both possessed more genes related to oxidative phosphorylation than other ascomycetes, which the authors suggested may be related to their production of oxalic acid during the infection process. However, both genomes carried fewer copies of pectinases, cellulases, and hemicellulases than most other plant pathogenic fungi surveyed, such as Magnaporthe oryzae and Gibberella zeae.

Trends in the presence and absence of pathogenicity-related genes were also detected when the genomes of the human fungal pathogens Coccidioides immitis and C. posadasii were compared to those of the non-pathogenic but closely-related Uncinocarpus reesii and the pathogenic but more distantly-related Histoplasma capsulatum (Sharpton et al. 2009). All four species belong to the Onygenales and were compared to the genomes of previously-sequenced members of the Eurotiomycetes, including several species of Aspergillus and Penicillium, which are both members of the Eurotiales. In these comparisons, all members of the Onygenales examined either lacked entirely or possessed far fewer predicted proteins homologous to protein families that are directly related to plant degradation (e.g. cellulases, cutinases, pectin lyases) relative to the Eurotiales studied. However, the genomes of the pathogenic Coccidioides species studied possessed a large number of serine proteases and keratinases relative to the other species examined, which may be related to their role as animal pathogens, rather than soil saprobes.

Whole-genome sequencing has also identified trends in the physical arrangement of the genomes of pathogenic strains relative to non-pathogenic strains. As early as 1992, the Fot1 transposable element, detected in Fusarium oxysporum, was proposed as an important source of genetic variation (Daboussi et al.). Transposable elements (TEs) are fragments of DNA that can change their position within a genome (Hartl and Clark 2007, chapter 4). There are two general classes of transposable elements, denoted as class I and class II. Class I elements are also called retrotransposons because the TE sequence that was previously integrated into the genome is transcribed into RNA before being reverse-transcribed back into DNA to be re-integrated into the genome at a different position (Daboussi and Capy 2003). Class II elements move throughout the genome by being excised as DNA and simply re-locating to a new position (Daboussi and Capy 2003). Members of both class I and class II TEs have been identified in filamentous Ascomycetes (Daboussi and Capy 2003), and transposon-rich regions are physically associated with genes that confer host specificity in several Ascomycete plant pathogens, including Magnaporthe oryzae (Dean et al. 2005; Thon et al. 2006), Alternaria alternata (Hatta et al. 2002), and Verticillium spp. (Amyotte et al. 2012).

Furthermore, large-scale genomic differences have been reported between filamentous fungi with differing plant hosts. When the genomes of Fusarium oxysporum f. sp. lycopersici, F. verticillioides, and F. graminearum were sequenced, F. oxysporum was found to contain 19 Mb of sequences that were not shared with the other Fusarium species examined (Ma et al. 2010). These sequences contained 74% of the TE sequences identified in this species, and although some of these sequences were found on chromosomes that were shared with the other species, the majority were organized into four unique chromosomes. In addition to containing the majority of TEs, the unique segments of the F. oxysporum genome also contained a high concentration of predicted genes that shared a high level of sequence identity with effectors, virulence factors, and proteins that are involved in signal transduction (i.e. genes that may play a role in pathogenicity). When the genome of the F. oxysporum f. sp. lycopersici isolate sequenced was compared to those of different strains of F. oxysporum (one a pathogen of Arabidopsis and one belonging to f. sp. vasinfectum), the genes within the unique region were not shared with the other strains examined, suggesting that these genes may play a key role in host-specific pathogenicity. Taken together, this evidence suggests that the comparison of closely-related genomes may permit the identification of expanded gene families that may help to explain some of the unique characteristics of the pathogenicity of a species of interest.


3.1.5 Objectives

The primary objective of this project was to obtain and assemble whole-genome sequences for isolates of M. nivale and M. majus. These genome sequences were compared to each other and to the whole-genome sequences of other filamentous ascomycetes to determine the amount of variation within and between different species of Microdochium. The whole-genome information was used to design a single primer set that amplifies both species but distinguishes between them by producing amplicons of different sizes.

3.2 Materials and Methods

3.2.1 DNA extraction, quantification, and sequencing

Mycelium for DNA extraction was prepared as described in Chapter 2. The genomic DNA for next-generation sequencing (NGS) was extracted using either the Qiagen DNeasy Plant Mini Kit (Qiagen, Mississauga, Canada) (M. majus isolate 99049 and M. nivale isolate 11037) or the PowerSoil DNA isolation kit (Mo Bio, Carlsbad, CA, USA) (M. nivale isolates 12262 and 10106, M. majus isolate 10095, and M. bolleyi isolate 07020). The M. bolleyi DNA was extracted by Fang Shi (Hsiang lab) using the Qiagen method. For both methods, a total of 200 mg of fungal tissue was processed in two separate 100 mg batches for each of the isolates sequenced. The Qiagen extraction was performed according to the manufacturer's instructions with the following modifications. During the tissue homogenization step, an initial volume of 200 µL of buffer AP1 was added to a tube containing 100 mg of fungal tissue and approximately 50 mg of sterile, acid-washed sea sand (Fisher, Fair Lawn NJ, USA). The mycelium was then homogenized as described in Chapter 2. An additional 200 µL of buffer AP1 was added to the tube before proceeding with the remaining steps described in the manufacturer's protocol. During the final elution step, DNA was eluted with two washes of 50 µL rather than a single 100 µL rinse. The PowerSoil extraction was performed according to the manufacturer's instructions following the "alternative lysis method" with the following modifications. During each of the three vortex steps described in the alternative lysis protocol, the samples were vortexed for 20 s, rather than 3-4 s. The final elution was performed using 50 µL of buffer per sample, and the moistened columns were incubated at room temperature for 10 minutes before the final centrifugation.

Following extraction, all DNA was stored at -20 °C. The quality and quantity of the DNA were assessed by electrophoresis through agarose gels as described in Chapter 2.

The quantity of DNA sent for sequencing and the sequencing facility used for each reaction are described in Table 3.1. For M. majus isolate 99049, an Illumina Genome Analyser IIx platform was used, specifying a single full lane of 75 bp paired-end sequencing. For all other genomes sequenced, an Illumina HiSeq 2000 platform was used, specifying 100 bp paired-end sequencing from 1/4 to 1/8 of single lanes (multiplexed).

3.2.2 Genome assembly and gene prediction

Genome assemblies and analysis were conducted using either SHARCNET (Shared Hierarchical Academic Research Computing NETwork; www.sharcnet.ca), WestGrid (www.westgrid.ca), or local UBUNTU 12.01 servers with 16 GB (HP16) or 32 GB (GIGA32) of RAM. Where applicable, sequencing data obtained in BAM format were converted to FASTQ using the program SAMtools (Li et al. 2009a) prior to further analysis. All of the genomes studied were assembled using the programs ABySS v. 1.3.4 (Simpson et al. 2009), SOAPdenovo v. 1.05 (Li et al. 2008), and GapCloser v. 1.12 (Li et al. 2008) with odd-numbered kmers between 29 and 63. Assembly quality was assessed by examining the N50 value and the total number of scaffolds produced by the program. The N50 value is a statistic that provides an estimate of assembly quality. This value is calculated by sorting all contigs (or scaffolds) by size and then identifying the contig size at which 50% of all bases in the assembly are contained in contigs of that size or larger (Haridas et al. 2011). A higher N50 value reflects larger contigs or scaffolds. The word contig is an abbreviation of contiguous, and refers to the sequences that are produced from overlapping reads in a sequence assembly (Haridas et al. 2011). By comparison, scaffolds are composed of neighbouring contigs that are a known distance apart based on paired-read information, but that may be joined by unknown bases due to insufficient sequencing coverage of the adjoining regions (Haridas et al. 2011).
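As an illustration of how the N50 statistic is obtained, the following Python sketch computes it from a list of contig (or scaffold) lengths. It is a minimal example written for this explanation, not the code used by any of the assembly programs, and the function name is hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    N50 is the length L such that contigs of length L or longer together
    contain at least half of all bases in the assembly.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
```

For example, for contigs of 80, 70, 50, 40, 30, 20, and 10 kb (300 kb in total), the two largest contigs together reach 150 kb, so the N50 is 70 kb.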

For M. majus isolate 99049, two separate rounds of assemblies were performed. The first assembly was performed using the methods described above, using the raw sequencing reads. In the second round of assemblies, the contigs from the highest N50 assembly of the first round were used as single-end inputs and assembled again. For each genome, genes were predicted from the assembly with the highest N50 using AUGUSTUS v. 2.5.5 (Stanke et al. 2004). The predicted gene set of Magnaporthe grisea (included in the program) was used to train the algorithm. The predicted genes were annotated using the web interface of FastAnnotator (available at http://fastannotator.cgu.edu.tw; Chen et al. 2012). Additional detail was added to the annotations by comparing the putative protein family (Pfam) accession numbers (Punta et al. 2012) assigned to the predicted genes to the full Pfam database downloaded from http://pfam.sanger.ac.uk. The script "annotate_genes.pl" was used to format the annotated sequences. Examples of the scripts and configuration files used for assembly, gene prediction, and annotation are found in Appendix 3.2, Appendix 3.3, Appendix 3.4, Appendix 3.5, and Appendix 3.6.

3.2.3 Whole-genome comparisons and identification of unique genes

BLAST databases for the predicted gene sets and the assembled scaffold sequences were prepared using the command "makeblastdb" included with the standalone BLAST package BLAST+ v. 2.2.25 (Altschul 1990). To identify predicted genes that are unique to M. nivale and / or M. majus, the predicted gene sets of the sequenced Microdochium genomes were searched against each other and against a set of fifteen other Sordariomycete genomes, including six members of the Xylariales (Table 3.7), using the tBLASTn algorithm. The resulting data were parsed (parse_m9.pl; Appendix 3.7) to remove comments. For each of the Microdochium genomes studied, the results were then summarized in a matrix (make_simple_table_v2.pl, Appendix 3.8) wherein each predicted gene was recorded as either present (1) or absent (0) in each of the other genomes and / or genome categories (the Microdochium, Xylariales, and other Sordariomycetes). This script also applied a filter to exclude predicted genes that were shorter than 100 amino acids (or 300 bp) in length and to impose a maximum e-value score. For these comparisons, three different maximum e-value thresholds were used: 1e-05, 1e-20, and 1e-50. Finally, this table was queried (summarize_with_files_v2.pl, Appendix 3.9) to identify genes that were unique to the given genome, or that were only found within certain categories of interest (e.g. only among M. nivale isolates from wheat).
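The logic of the presence/absence matrix can be sketched in Python as follows. This sketch is not a translation of the Perl scripts listed in the appendices; the function names and the input structures are hypothetical, and the input is assumed to be BLAST tabular output with twelve tab-separated columns in which the query identifier is in the first column and the e-value is in the eleventh.

```python
def parse_tabular_hits(lines, max_evalue):
    """Return the set of query IDs with at least one hit at or below max_evalue.

    Lines beginning with '#' are treated as comments and skipped.
    """
    hits = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if float(fields[10]) <= max_evalue:
            hits.add(fields[0])
    return hits


def presence_matrix(gene_lengths, hits_by_genome, min_length_aa=100):
    """Build {gene: {genome: 0 or 1}} for genes that pass the length filter."""
    matrix = {}
    for gene, length in gene_lengths.items():
        if length < min_length_aa:
            continue  # exclude short predicted genes
        matrix[gene] = {
            genome: int(gene in hits) for genome, hits in hits_by_genome.items()
        }
    return matrix


def unique_genes(matrix):
    """Return the genes that have no hit in any of the other genomes."""
    return sorted(gene for gene, row in matrix.items() if not any(row.values()))
```

Running parse_tabular_hits once per comparison genome and per e-value threshold, and passing the resulting sets to presence_matrix, would give one matrix per threshold from which unique or category-specific genes could be listed.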

To assess the relationships within Microdochium, the full gene sequences (including introns) of ten predicted protein sequences that were apparently unique to Microdochium (based on the methods described above) were collected from the genomes of all of the Microdochium isolates sequenced. These sequences were concatenated and aligned using ClustalX as described in Section 2.2.5, and maximum likelihood, neighbour-joining, and maximum parsimony trees were constructed using PAUP* v. 4.0 to visualize the relationships between the isolates sequenced. For each tree, 100 bootstrap replicates were calculated. The identities of these genes are listed in Table 3.10 and their alignment is found in Appendix 3.15.

In addition to these comparisons, the synteny between the whole genomes of M. majus isolate 99049 and M. nivale isolates 11037 and 12262 was investigated using the program Mauve v. 2.3.1 (Darling et al. 2004). Prior to this analysis, each of the Microdochium genomes was independently aligned in Mauve to the genome of Magnaporthe grisea (isolate 70-15, assembly version 6, downloaded from the Broad Institute, http://www.broadinstitute.org/). These "pre-aligned" genomes were then re-aligned together (without M. grisea).

During the multiple-gene comparisons performed in Chapter 2, the M. bolleyi EF-1α sequence could not be amplified using a combination of the primers EFNivF, EFMicF, and EFMicR. To investigate the cause of this failure, a putative EF-1α sequence was identified in the M. bolleyi predicted gene set by using the M. nivale and M. majus partial EF-1α sequences amplified in Chapter 2 to query the M. bolleyi 07020 predicted gene set using the BLASTx algorithm. The EFNivF and EFMicF primer sequences, as well as the reverse complement of the EFMicR primer sequence, were aligned with the putative M. bolleyi EF-1α nucleotide sequence using ClustalX.

3.2.4 Design of species-specific primers

The nucleotide sequences of the predicted genes of M. nivale isolate 11037 were used to query a BLAST database constructed from the nucleotide sequences of the M. majus 99049 genome. The output of this search was parsed using a script (find_genes_of_diff_length.pl, Appendix 3.10) to identify genes that had both a maximum e-value of 1e-05 and a minimum length difference of 50 bp. The full sequences of genes that met these criteria were extracted from the genome (including both introns and exons), and were aligned using ClustalW. The alignments were sorted manually to assess both the quality of the match and the presence or absence of total length differences. Primers were designed (Section 2.2.6) for candidates which possessed both a) highly conserved regions that were at least 300 bp apart (for primer design purposes), and b) a sequence length difference of at least 50 bp between the highly conserved regions. These primers were synthesized by Laboratory Services Division, University of Guelph (Guelph, Canada) and were tested according to the protocol for the ITS PCR (Section 2.2.4), with the exception of the annealing temperature. The primers tested and their annealing temperatures are listed in Table 3.5. Each of the primers was tested with at least two M. majus isolates (one each from North America and from Europe), and at least two M. nivale isolates (one each from wheat and from turf).
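The first filtering step can be sketched in Python. This is an illustration only and does not reproduce find_genes_of_diff_length.pl: the function name and the dictionary keys are hypothetical, and the sketch assumes that the length difference is measured between a query gene and its best-scoring match.

```python
def candidate_genes(hits, max_evalue=1e-05, min_length_diff=50):
    """Select query genes whose best hit differs in length by a minimum amount.

    `hits` is an iterable of dicts with the hypothetical keys 'query',
    'query_length', 'subject_length', and 'evalue'.
    """
    best = {}
    for hit in hits:
        if hit["evalue"] > max_evalue:
            continue  # discard weak matches
        current = best.get(hit["query"])
        if current is None or hit["evalue"] < current["evalue"]:
            best[hit["query"]] = hit  # keep the lowest e-value per query
    return sorted(
        query
        for query, hit in best.items()
        if abs(hit["query_length"] - hit["subject_length"]) >= min_length_diff
    )
```

Candidates returned by a filter of this kind would still need to be inspected manually in an alignment, because a difference in total length does not guarantee that the difference lies between two conserved regions suitable for primers.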

3.2.5 Identification of putative pathogen-related genes

Potential pathogen-related genes were identified in the six sequenced Microdochium genomes using a list of pathogen-host interaction (PHI) genes obtained from PHI-base (Winnenburg et al. 2006). These sequences were queried against BLAST databases constructed from the predicted protein sequences using the BLASTp algorithm. The presence or absence of these sequences among the predicted proteins was then assessed using the script "compare_phi_results.pl" (Appendix 3.11). The number of matches for each gene was assessed, and all matches with a length of at least 100 bp and an e-value of 1e-25 or lower were tabulated. Where possible, the functions of genes with a large difference in the number of matches among the Microdochium genomes were noted (Table 3.9).


3.2.6 Identification of putative transposable elements

Where available, the protein sequences of transposable elements previously identified in other filamentous ascomycetes were downloaded from GenBank. Nucleotide sequences were downloaded for some sequences when protein sequences were unavailable. These sequences were selected to represent the major families of Class I and Class II elements that have been reported from filamentous ascomycetes. A list of the sequences used is found in Table 3.11. These sequences were queried against the BLAST databases that had been constructed from the scaffold genome assemblies of all six Microdochium isolates examined using the tBLASTx or tBLASTn algorithm as appropriate. All matches with e-values of less than 1e-05 were collected for further analysis. To avoid counting the same putative TEs as hits to more than one of the query sequences, the hits in the genome were only counted as putative TEs if they were not within 500 bp of another hit. The positions of these putative TEs relative to the putative PHI genes identified as described above (Section 3.2.5) were compared using the script "check_proximity.pl" (Appendix 3.12), and all putative PHI genes that were within 5 kb of a putative TE were noted. Duplicate matches of similar TE-PHI pairs on the same scaffold were eliminated using the script "strip_duplicates.pl" (Appendix 3.13).

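The two positional filters described in Section 3.2.6 (collapsing hits that lie within 500 bp of one another, and noting PHI genes within 5 kb of a putative TE) can be sketched in Python. This is an interpretation written for illustration and is not the code of check_proximity.pl or strip_duplicates.pl: the function names and tuple layouts are hypothetical, nearby hits are merged into a single locus as one possible way of counting them once, and all coordinates are assumed to be given with start less than or equal to end.

```python
def merge_nearby_hits(hits, min_gap=500):
    """Collapse TE hits on the same scaffold that lie within min_gap bp.

    `hits` is a list of (scaffold, start, end) tuples. A hit that starts
    within min_gap bp of the end of the previous locus is merged into it,
    so that one locus matched by several queries is counted once.
    """
    merged = []
    for scaffold, start, end in sorted(hits):
        if merged and merged[-1][0] == scaffold and start - merged[-1][2] <= min_gap:
            previous = merged[-1]
            merged[-1] = (scaffold, previous[1], max(previous[2], end))
        else:
            merged.append((scaffold, start, end))
    return merged


def genes_near_tes(genes, tes, max_distance=5000):
    """Return the names of genes lying within max_distance bp of any TE.

    `genes` is a list of (name, scaffold, start, end) tuples and `tes` is a
    list of (scaffold, start, end) tuples.
    """
    near = []
    for name, scaffold, g_start, g_end in genes:
        for te_scaffold, t_start, t_end in tes:
            if te_scaffold != scaffold:
                continue
            # Distance between the two intervals; 0 if they overlap.
            distance = max(t_start - g_end, g_start - t_end, 0)
            if distance <= max_distance:
                near.append(name)
                break  # report each gene once
    return near
```

Reporting each gene once, as in the inner loop above, is one simple way to avoid listing the same TE-PHI association repeatedly on a scaffold.
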
3.3 Results

3.3.1 Genome sequencing, assembly, and protein prediction

The whole-genome sequences of M. majus isolates 99049 and 10095, M. nivale isolates 11037, 12262, and 10106, and M. bolleyi isolate 07020 were obtained as raw reads in either FASTQ (99049, 10095, 12262, 10106, 07020) or BAM (11037) format. The total number of raw reads obtained for each genome is summarized in Table 3.3.


The genome assembly programs Velvet, SOAPdenovo, and ABySS were used to assemble the genome of M. majus isolate 99049. Of these three programs, Velvet yielded the poorest results (Table 3.4) based on the N50 values and the total genome size. This program was not used to assemble the other genomes, which were obtained later. For all genomes, assemblies were successfully produced by both SOAPdenovo and ABySS using kmers 25-63. The genome sizes of both M. majus isolates were 35.9 Mb. For M. nivale turf isolate 11037, the draft genome sequence was 37.0 Mb in length; for the wheat isolates 12262 and 10106, the genome sizes were 36.7 and 37.1 Mb, respectively. The genome of M. bolleyi isolate 07020 was 38.2 Mb in length.

The program AUGUSTUS was used to predict protein sequences from all of the genomes studied, and the total number of proteins predicted from the best assembly of each genome is summarized in Table 3.3. For each of the genomes, only the assembly with the largest N50 was used for protein prediction. All of the M. nivale and M. majus isolates sequenced, with the exception of M. nivale turf isolate 11037, had between 11.3 × 10³ and 11.7 × 10³ predicted proteins. The M. nivale turf isolate had 12 × 10³ predicted proteins, and the M. bolleyi isolate had 13 × 10³ predicted proteins. The number of predicted genes from each genome that were successfully annotated using FastAnnotator is noted in Table 3.3.

3.3.2 Whole-genome comparisons

Several comparisons were made between and among the whole-genome data of the six Microdochium genomes sequenced in these experiments. These data were also compared to whole-genome data from seven species of the Xylariales and thirteen other non-Xylariales Sordariomycetes. The results of these comparisons, with e-value maxima of 1e-05, 1e-20, and 1e-50, are found in Table 3.5, Table 3.6, and Table 3.7, respectively.
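The way a percentage of shared genes depends on the e-value maximum can be sketched as follows. This is a simplified illustration, not the pipeline used for these comparisons: it assumes that each search hit has already been reduced to a (query gene, e-value) pair, and the function name and toy hits are hypothetical.

```python
def percent_shared(hits, total_genes, cutoff):
    """Percentage of query genes with at least one hit at or below the cutoff.

    hits: iterable of (query_id, evalue) pairs.
    total_genes: number of predicted genes in the query genome.
    cutoff: e-value maximum (e.g. 1e-05, 1e-20, or 1e-50).
    """
    matched = {query for query, evalue in hits if evalue <= cutoff}
    return 100.0 * len(matched) / total_genes


# Hypothetical hits for a query genome with five predicted genes.
toy_hits = [("g1", 1e-80), ("g1", 1e-10), ("g2", 1e-30),
            ("g3", 1e-06), ("g4", 0.5)]

for cutoff in (1e-05, 1e-20, 1e-50):
    print(cutoff, percent_shared(toy_hits, total_genes=5, cutoff=cutoff))
```

Lowering the e-value maximum can only remove genes from the matched set, so the shared percentage is expected to stay the same or fall as stringency increases.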

At an e-value maximum of 1e-05, the two M. majus genomes, 10095 and 99049, shared 99.8 and 99.7%, respectively, of all genes. When the e-value maximum was decreased to 1e-50, these isolates shared 89.4 and 92.7% of all genes. For the genomes of M. nivale isolates 12262 and 10106, which were both isolated from wheat, 97.9 and 98.0% of genes were shared at an e-value maximum of 1e-05. At an e-value maximum of 1e-50, 92.9 and 89.4% of all genes, respectively, were shared between these isolates. The M. nivale turf isolate 11037 shared 92.2% of its genes with both of the M. nivale wheat isolates at an e-value maximum of 1e-05. At an e-value maximum of 1e-50, this isolate shared 78.5% of its genes with the two M. nivale wheat isolates.

Between M. nivale and M. majus, at an e-value maximum of 1e-05, isolate 11037 shared 91.6% of its genes with both M. majus genomes; the M. nivale wheat isolates 12262 and 10106 shared 95.5% and 92.9% of their genes with the M. majus isolates, respectively. The M. majus isolates 99049 and 10095 shared 92.5 and 89.9% of their genes with 11037. Isolate 99049 shared 95.3% with both 12262 and 10106, and 10095 shared 93.2% with both M. nivale wheat isolates. At an e-value maximum of 1e-50, isolate 11037 shared 78.0% of its genes with both M. majus isolates. The wheat isolates 12262 and 10106 shared 81.8 and 78.9% of their genes with the two M. majus isolates. The M. majus isolates 99049 and 10095 shared 79.7 and 77.1% of their genes, respectively, with turf isolate 11037. Isolate 99049 shared 82.6% and 10095 shared 79.7% of its genes with the two M. nivale wheat isolates.

At an e-value maximum of 1e-05, the M. bolleyi isolate shared 81.0% of its genes with M. nivale turf isolate 11037, 80.3% with both of the M. nivale wheat isolates, and 80.5% of its genes with the M. majus isolates. At an e-value maximum of 1e-50, the M. bolleyi isolate shared 67.7% of its genes with M. nivale turf isolate 11037 and 66.8% of its genes with both the M. nivale wheat isolates and the M. majus isolates.

In addition to these comparisons, ten predicted genes that were not found in any of the non-Microdochium Sordariomycete genomes examined but that were found in all of the Microdochium genomes were concatenated and were used to construct maximum likelihood (ML), neighbour-joining (NJ), and maximum parsimony (MP) trees, each with 100 bootstrap replicates (Figure 3.4). In all three trees, the two M. majus isolates formed a single clade with 100% bootstrap support. Similarly, the two M. nivale isolates from wheat also formed a single clade with 100% bootstrap support in all trees. The M. nivale turf isolate 11037 did not group with the other M. nivale isolates in any of the trees. In both the NJ and the ML trees, the isolate grouped with the M. majus isolates (with 61 and 87% bootstrap support, respectively). In the MP tree, the isolate formed a trichotomy with the M. bolleyi isolate and the clade containing the remaining isolates. In all three trees, the M. bolleyi isolate formed either a dichotomy (NJ and ML) or a trichotomy (MP) with the other Microdochium isolates included in the analysis. In all three trees, the relative branch lengths linking the two M. majus isolates were shorter than the branch lengths for the M. nivale wheat isolates.

A comparison of the whole-genome synteny among the Microdochium genomes was attempted using the program Mauve; however, the inability to assemble the Microdochium spp. genomes into chromosomes prevented the extraction of meaningful results from this comparison. Despite pre-aligning the genomes with the genome of Magnaporthe grisea, the synteny of the Microdochium genomes could not be assessed.

3.3.3 Development of species-specific primers

Of the 228 predicted gene sequences aligned, 14 were selected as good candidates for further investigation based on their possession of two conserved regions (appropriate for primer design) that were separated by at least 300 bp, as well as sufficient indels to yield a length difference of at least 50 bp between these two conserved regions. Primers were designed and ordered for seven of these genes. The results of the primer tests are summarized in Table 3.5. Of the primer sets tested, only Med5 and 371 yielded results that differed consistently between the two species of interest.
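The two selection criteria described above can be expressed as a simple filter. The sketch below is illustrative only: it assumes that the spacing between the two conserved regions has already been measured in each species from the alignment, and that the 300 bp minimum applies to both species. The class, function, and gene names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    spacing_a: int  # bp between the two conserved regions in species A
    spacing_b: int  # bp between the same two regions in species B


def is_good_candidate(c, min_spacing=300, min_difference=50):
    """Apply the spacing and length-difference criteria to one aligned gene."""
    # The conserved regions must be at least min_spacing bp apart
    # (assumed here to be required in both species).
    far_enough = min(c.spacing_a, c.spacing_b) >= min_spacing
    # Indels between the regions must change the expected product
    # length by at least min_difference bp.
    distinguishable = abs(c.spacing_a - c.spacing_b) >= min_difference
    return far_enough and distinguishable


# Hypothetical aligned genes, for illustration only.
toy_genes = [
    Candidate("geneA", 420, 350),  # passes both criteria
    Candidate("geneB", 310, 290),  # conserved regions too close in species B
    Candidate("geneC", 500, 480),  # length difference too small
]
selected = [c.name for c in toy_genes if is_good_candidate(c)]
print(selected)
```

A filter of this kind only identifies candidates from the two reference genomes; as the primer tests show, it cannot guarantee that the length difference holds across other isolates of each species.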

3.3.4 Identification of pathogenesis-related genes

A total of 2,614 pathogen-host interaction (PHI) genes (Winnenburg et al. 2006) were searched against the predicted protein sequences of all of the Microdochium genomes. A total of 1,831 genes were shared among all of the Microdochium predicted gene sets examined. Of the 2,614 genes included in the analysis, 451 were not identified in any of the predicted gene sets.

In M. bolleyi isolate 07020, 1,873 of the PHI genes were found, including nine that were not shared with any of the other species examined; conversely, the other five predicted gene sets shared a total of eight genes that were found in all predicted gene sets except M. bolleyi. In M. nivale turf isolate 11037, 1,880 PHI genes were found, including five that were not shared with any of the other species examined. In M. nivale wheat isolates 12262 and 10106, 1,870 and 1,866 PHI genes, respectively, were identified. One gene was identified in isolate 12262 that was not shared with any of the other Microdochium predicted gene sets; there were no unique genes in isolate 10106. There were four PHI genes that were found in both of the M. nivale wheat isolates that were not shared with any of the other isolates examined. In M. majus isolates 99049 and 10095, 1,866 and 1,871 PHI genes were identified, respectively. Two genes that were not shared with any of the other sequenced isolates were identified in isolate 99049, but none were observed in isolate 10095. Among the M. majus isolates, there were two genes that were found in both predicted gene sets that were not found in any of the other Microdochium predicted gene sets. Among the 451 PHI genes that were not found in any of the Microdochium predicted gene sets, 148 were bacterial genes, 19 were from Oomycetes, and 87 were from Basidiomycetes; the remaining "missing" genes were all from members of the Ascomycota.

Because the Microdochium species examined shared a nearly identical set of putative PHI genes, the relative number of hits for each of the PHI genes was also assessed to determine whether copy number differences could be detected. For 1,736 of the PHI genes, the number of putative copies (within the restrictions imposed on the search) found in all six of the Microdochium genomes was identical. For the remaining PHI genes, differences were observed in the number of putative matches found in the predicted gene sets. The number of matches in the M. nivale and M. majus genomes was nearly identical for the majority of PHI genes identified. The total number of matches in each genome differed by fewer than ten for all of the PHI genes assessed, with the exception of a single PHI gene which had 28 copies in both the turf isolate M. nivale 11037 and in M. bolleyi 07020 and only seven matches in all four of the other genomes. In addition, there were 47 genes that had at least ten more matches in the M. bolleyi genome than in any of the other genomes, and 39 genes that had at least ten fewer matches in the M. bolleyi genome than in any of the others. The functions of many of the PHI genes that yielded these discrepancies were unknown; however, Table 3.9 lists PHI genes with differing copy numbers in the Microdochium genomes for which putative functions were available.
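The copy-number categories described above can be sketched as a small classifier applied to the per-genome match counts of one PHI gene. This is an illustration under stated assumptions rather than the procedure used here: the function name, category labels, and toy counts are hypothetical, and the ten-match threshold is taken from the text.

```python
def classify_matches(counts, focal, threshold=10):
    """Classify one PHI gene from its match counts in each genome.

    counts: dict mapping genome name -> number of matches.
    focal: the genome compared against all others (here, M. bolleyi).
    """
    others = [n for genome, n in counts.items() if genome != focal]
    if len(set(counts.values())) == 1:
        return "identical"
    if counts[focal] - max(others) >= threshold:
        return "focal_more"    # at least `threshold` more than any other genome
    if min(others) - counts[focal] >= threshold:
        return "focal_fewer"   # at least `threshold` fewer than any other genome
    return "variable"


# Hypothetical counts for four PHI genes across three genomes.
toy = {
    "phi_1": {"07020": 3, "11037": 3, "12262": 3},
    "phi_2": {"07020": 25, "11037": 7, "12262": 7},
    "phi_3": {"07020": 2, "11037": 14, "12262": 12},
    "phi_4": {"07020": 28, "11037": 28, "12262": 7},
}
for gene, counts in toy.items():
    print(gene, classify_matches(counts, focal="07020"))
```

The last toy gene mirrors the single case reported above, in which the high count is shared by M. bolleyi and one other genome, so it is not counted as specific to M. bolleyi.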

3.3.5 Identification of putative transposable elements and comparison to PHI genes

A total of 24 transposable element (TE) sequences representing the major groups of TEs found in filamentous ascomycetes were downloaded from GenBank, and were used to query the Microdochium spp. genomes. Of the 24 TE query sequences, four did not have a match in any of the Microdochium genomes. These four sequences included the Class II MITE-like sequences guest from N. crassa and mimp from F. oxysporum, the mutator-like sequence from N. parvum, and the Class I gypsy mars integrase from A. immersus. Both of the M. majus isolates examined possessed matches for the same 17 of 24 TE sequences queried. Among the M. nivale isolates examined, each isolate displayed a slightly different pattern of which TEs were found. The turf isolate 11037 possessed matches to only 13 of the 24 TE sequences, whereas the wheat isolates 12262 and 10106 matched 15 and 16 TE sequences, respectively. One TE sequence, the Class I skippy gag sequence from F. oxysporum, possessed at least one match in all of the genomes except the two M. nivale wheat isolates. The M. bolleyi isolate 07020 possessed at least one match to 19 of the 24 TE sequences, and one TE had a match in M. bolleyi only. Within these categories, the total number of putative TEs identified in each species is listed in Table 3.12. When the positions of the putative TE sequences were compared to those of the PHI genes, over 50% of the putative TEs were found to be within 5 kb of a PHI gene.
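The positional comparison between putative TEs and PHI genes can be sketched as an interval-distance check. The sketch assumes that each feature is described by a scaffold name with start and end coordinates, and that distance is measured between the nearest edges of two features on the same scaffold; these conventions, the function names, and the toy coordinates are assumptions for illustration.

```python
def gap(a_start, a_end, b_start, b_end):
    """Distance in bp between two intervals on one scaffold (0 if they overlap)."""
    return max(0, b_start - a_end, a_start - b_end)


def fraction_near(tes, genes, window=5000):
    """Fraction of TEs lying within `window` bp of any gene on the same scaffold.

    tes, genes: lists of (scaffold, start, end) tuples.
    """
    near = 0
    for scaffold, start, end in tes:
        if any(g_scaffold == scaffold
               and gap(start, end, g_start, g_end) <= window
               for g_scaffold, g_start, g_end in genes):
            near += 1
    return near / len(tes)


# Hypothetical coordinates, for illustration only.
toy_tes = [("s1", 1000, 2000), ("s1", 50000, 51000), ("s2", 100, 900)]
toy_genes = [("s1", 6000, 7500), ("s2", 20000, 21000)]
print(fraction_near(toy_tes, toy_genes))
```

In this toy case only the first TE lies within 5 kb of a gene (a gap of 4,000 bp), so one of the three TEs is counted as near a PHI gene.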

3.3.6 Identification of a putative EF-1α sequence in the genome of M. bolleyi

When the M. nivale and M. majus partial EF-1α nucleotide sequences obtained in Chapter 2 were queried against the M. bolleyi 07020 predicted protein set, the top match for all of the sequences was predicted protein g6339. When the primers that had been used to amplify the partial EF-1α sequences from M. nivale and M. majus were aligned with the nucleotide sequence of this predicted gene, the reverse complement of the reverse primer was found to be a perfect match to the predicted gene sequence; however, both of the forward primer sequences had several mismatches with the predicted gene sequence. More importantly, the predicted binding site of the reverse primer was upstream relative to the predicted binding sites of both forward primers. The alignment of these sequences is found in Appendix 3.14.
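The orientation problem described above can be illustrated with a short sketch that locates a forward primer and the reverse complement of a reverse primer on a template strand. It handles exact matches only, so it would not locate primers with mismatches such as the forward primers described here; the sequences and function names are hypothetical and are not the EF-1α primers or the g6339 sequence.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def primer_sites(template, forward, reverse):
    """Return 0-based positions of the forward primer and of the reverse
    primer's reverse complement on the template (-1 if no exact match)."""
    return template.find(forward), template.find(reverse_complement(reverse))


def can_amplify(forward_pos, reverse_pos):
    """A product is expected only if both sites exist and the forward
    primer site lies upstream of the reverse primer site."""
    return forward_pos != -1 and reverse_pos != -1 and forward_pos < reverse_pos


# Hypothetical sequences, for illustration only.
forward, reverse = "GACCATG", "AACGATC"
usable = "TTGACCATGCAGGTACGATCGTTAGC"   # forward site precedes reverse site
inverted = "GATCGTTAAAGACCATG"          # reverse site precedes forward site

print(primer_sites(usable, forward, reverse),
      can_amplify(*primer_sites(usable, forward, reverse)))
print(primer_sites(inverted, forward, reverse),
      can_amplify(*primer_sites(inverted, forward, reverse)))
```

The second template reproduces the situation reported for M. bolleyi, where the reverse primer site lies upstream of the forward primer site and no product spanning the two sites would be expected from that primer pair.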

3.4 Discussion

Next-generation sequencing technology was used to obtain de novo genome sequences for two isolates of M. majus, three isolates of M. nivale, and one isolate of M. bolleyi. The isolates were sent for sequencing at separate times, with the genome of 99049 obtained first, followed by 11037, then 12262, and finally 10095, 10106, and 07020 all at the same time. The raw sequencing reads were successfully assembled into scaffolds, which were used to predict protein sequences from all isolates. The assembled genomes and predicted proteins were then used to perform comparisons between these sequenced isolates.

For M. majus isolate 99049 only, two separate rounds of genome assembly were conducted. In the first round, the raw sequencing reads were used as the input for the assembly programs, whereas in the second assembly round, the contigs obtained from the first assembly were used as the input. This second assembly substantially increased the size of the scaffolds obtained, from an N50 of 4,106 bp to 96,968 bp. This large increase in scaffold size increased the utility of the genome data by increasing the number of predicted genes on each scaffold. This was particularly useful in MAT gene identification and synteny studies (Chapter 5).

For the remaining genome sequences, periodic re-assembly of the genomes using the same methodology and/or newer versions of the assembly programs as they were released yielded only modest improvements in the overall assembly quality based on the N50 score. Two factors are likely to have contributed to the poor quality of the initial assembly of 99049. First, the genome sequencing data for 99049 were obtained approximately six months earlier than any of the other data, and were generated using a read length of only 75 bp with an insert size of 200 bp, whereas all of the other assemblies had a read length of 100 bp with an insert size of at least 300 bp. This difference reflects the rapid improvement of sequencing technology and methodology within this short time frame. Second, while the sequencing of 99049 was conducted in Vancouver, British Columbia, the remaining isolates were sequenced in Montréal, Québec. The relative proximity of the Montréal sequencing facility decreased the time that the samples spent in transit, thus minimizing the risk of unfavourable conditions such as warm temperatures, which may have decreased the overall quality of the DNA of 99049. The comparatively poor quality of the sequencing data from 99049 relative to the other M. nivale and M. majus sequencing data, in addition to similar results obtained for other fungal genomes sequenced at these two facilities within the Hsiang lab (personal communication), led to the use of the Québec sequencing facility for the other genomes sequenced after 99049.

Aside from the difficulties discussed in assembling the genome of 99049, the other Microdochium genomes were readily assembled using SOAPdenovo and ABySS. The N50 values for the final assemblies used in subsequent comparisons ranged from 96 kb (99049) to 371 kb (10106) for the M. nivale and M. majus data. The N50 obtained for the M. bolleyi genome was over 2 Mb in length; this is over five times greater than that obtained for any of the other Microdochium genomes. This difference could not be readily explained; the approximate quantity and concentration of the M. bolleyi DNA was similar to that of the other Microdochium DNA sequenced at the same time, and the quality of this assembly also exceeded that of any of the other filamentous Ascomycetes sequenced in the Hsiang lab (unpublished results), including genomes that have been obtained after the M. bolleyi data. This unexpected result is the subject of active research.

A nearly identical number of predicted genes was obtained from both of the M. majus isolates sequenced, despite the fact that the second genome obtained (10095) had over 50% greater sequencing coverage than the 99049 genome. A similar trend was observed between the two M. nivale wheat genomes, 12262 and 10106, where the genome of 12262 had over 150% greater sequencing coverage than 10106, yet the 10106 genome yielded slightly more predicted genes. In addition, a similar number of predicted genes was obtained for all of the M. nivale and M. majus genomes sequenced. By comparison, there were nearly 1,000 more predicted genes in M. bolleyi relative to the M. nivale isolate with the highest number of predicted genes, whereas within the M. nivale and M. majus genomes, the difference in number between the isolate with the highest (11037) and the lowest (99049) number of predicted genes was just over 700. These similarities between and among the M. nivale and M. majus predicted gene sets suggest that, although there are likely protein-coding sequences that were not detected by AUGUSTUS (e.g. sequences that may have been interrupted by scaffold boundaries, or that received poor sequencing coverage simply due to random chance), the number of protein-coding genes among these species was similar. In addition, the fully sequenced and annotated genomes of over 300 fungi are available (Choi et al. 2013), including several phytopathogenic ascomycetes. Among these pathogenic ascomycetes, the number of predicted genes ranges from 5,854 (for Blumeria graminis) to 17,735 (in Fusarium oxysporum) (Raffaele and Kamoun 2012). The number of predicted genes for all of the Microdochium genomes sequenced in these experiments is well within this range. Together, these results suggest that the predicted gene sets obtained for the genomes sequenced were likely reasonably complete.

Using the genomes of M. nivale and M. majus, several sets of primers were designed with the goal of developing a method to rapidly distinguish between M. nivale and M. majus by amplifying a single band of a different size for each species. A total of seven primer sets were designed and tested towards this goal. Despite using the genomic data of M. majus isolate 99049 and M. nivale isolate 11037, these primers proved difficult to design and optimize. In most cases, the primers failed to yield results that differed in a consistent manner between the two species. Instead, multiple bands were often produced, which for most primers rendered the results for the two species indistinguishable. This discrepancy may have arisen because the primers were intentionally designed to span highly diverse regions (i.e. introns), and these regions may have differed in length between individuals, rather than between species as was originally intended.

To assess the variation both within and between these species, reciprocal genome-vs-genome tBLASTn searches (predicted genes vs scaffold sequences) were conducted. The results of these comparisons were assessed at three different e-value stringencies, with the goal of assessing the similarities and differences among and between the predicted genes of these species. At the highest stringency level (i.e. the smallest e-value), a smaller overall number of shared genes was expected, but the proportion of genes that were shared between the more closely related species (e.g. within the M. majus genomes) was expected to grow. At the least stringent e-values, the overall number of genes shared between the genomes studied was expected to grow. When these comparisons were conducted, the differences between the three stringency levels were smaller than expected. The same general trends were observed between all three stringency levels.

The tBLASTn algorithm was chosen for this analysis for two primary reasons. First, we were interested in identifying homologs of putative protein-coding sequences, rather than assessing the overall similarity of the genome, including non-coding regions. Second, by choosing tBLASTn, the predicted gene sets were compared to the whole-genome nucleotide sequences of each sequenced isolate. This facilitated the identification of protein-like sequences that were not included in the predicted protein set; this exclusion could occur for sequences that are truncated or otherwise non-functional homologs of sequences that may have functional analogs in the other isolates, or could simply be the result of an "oversight" by AUGUSTUS.

Among the M. nivale and M. majus isolates examined, an unexpected trend was observed regarding the dissimilarity of the M. nivale turf isolate 11037 to the two M. nivale wheat isolates also sequenced. Despite originating from North America and Europe, respectively, the two M. nivale wheat isolates 12262 and 10106 shared 97.9 and 95.6%, respectively, of their predicted protein sequences with each other at an e-value cutoff of 1e-05. By comparison, each of these isolates shared an average of 95.5 and 92.9%, respectively, of their predicted genes with the two sequenced M. majus isolates, and 93.1 and 90.1%, respectively, of their genes with M. nivale turf isolate 11037. Similarly, at an e-value cutoff of 1e-05, the turf isolate shared only slightly more of its predicted genes with the two M. nivale wheat isolates (92.2% average) compared to the two M. majus isolates (91.6% average). Very broadly, this isolate was thus found to be approximately equally dissimilar to M. nivale from wheat and to M. majus (but was still more similar to either of these groups than to M. bolleyi). This trend is tentatively in agreement with earlier assertions that M. nivale possesses a relatively high level of intra-specific variation and that there may be distinct sub-groups within this population (Lees et al. 1995). Similarly, because the other four M. nivale and M. majus strains examined were all originally isolated from wheat, this shared host plant origin may at least partially explain their relative similarity.

To explore this observation further and to visualize the relationships between the sequenced Microdochium genomes, the concatenated sequences of ten predicted proteins that were found in the genomes of all Microdochium species surveyed, but which were not found in any of the non-Microdochium genomes, were used to construct bootstrapped neighbour-joining, maximum likelihood, and maximum parsimony trees. All three trees were rooted with M. bolleyi, and yielded the same broad conclusions: the M. nivale wheat isolates and the M. majus isolates each formed single clades with 100% bootstrap support. Surprisingly, in both the NJ and ML trees, the M. nivale turf isolate 11037 grouped with the node containing the M. majus isolates (with 61 and 87% bootstrap support, respectively). In the MP tree, the M. nivale 11037 isolate formed a trichotomy with M. bolleyi and with the clade containing the remaining isolates.

Although the relative dissimilarity between the sequenced turf isolate of M. nivale and the wheat isolates is an unexpected and interesting observation, it is not possible to draw broad conclusions regarding the overall similarity (or lack thereof) between all M. nivale turf isolates compared to those from wheat based on the observations of only five genomes (only one of which was from turf). Instead, this finding demonstrates the importance of future work to investigate this dissimilarity. At least one European and one additional North American M. nivale turf isolate should be included in this analysis before attempting to form a more general hypothesis regarding the apparent importance of host origin in M. nivale.

In both the whole-genome vs. genome BLAST comparison and the Microdochium-specific gene tree, M. bolleyi was found to be more dissimilar to the other Microdochium spp. genomes than any of the other genomes were to one another. This was easily rationalized, as M. nivale and M. majus were considered to be a single species until 2005 (Glynn et al. 2005). In addition, M. bolleyi is primarily associated with the roots of graminaceous species (Murray and Gadd 1981), whereas M. majus and M. nivale attack the stem and grain of grasses and cereals (Parry 1990).

Differences between M. bolleyi and the other Microdochium genomes examined were also detected in the number of matches in the predicted gene sets of these genomes against the genes from the pathogen-host interaction (PHI) database. Although the identities of the PHI genes found among these genomes were similar, the relative number of matches among the genomes was more variable, especially between M. bolleyi and the other Microdochium genomes. A total of 86 PHI genes had at least ten more or at least ten fewer matches in M. bolleyi than in any other Microdochium genome examined. Although the putative functions of the majority of these genes were not available based on their GenBank summaries, their inclusion in the PHI database suggests that they may play a role in the pathogenicity of these species. One possible explanation for this difference is that M. bolleyi grows on or inside roots (Murray and Gadd 1981), rather than in above-ground plant parts as do M. nivale and M. majus (Parry 1990). In addition, M. bolleyi is described as a "weak" root pathogen (Murray and Gadd 1981), and has been found in non-damaging interactions with plant roots (Domsch and Gams 1972).

Putative transposable elements from both classes were identified in the draft genome sequences of all of the Microdochium spp. isolates examined. Representatives of nine of the ten TE superfamilies were identified in the Microdochium genomes; the exception was the Class II MITE-like elements. In addition to identifying TEs, the positions of these sequences relative to the PHI genes were identified to investigate the hypothesis that pathogenicity-related genes are often associated with TEs (e.g. Amyotte et al. 2012; Hatta et al. 2002). Of the putative TEs identified in the Microdochium genomes, more than 50% of the sequences were found within 5 kb of a PHI gene, suggesting that Microdochium follows this general trend.

This chapter describes the generation of whole-genome sequencing data for a total of six Microdochium strains, consisting of two isolates of M. majus, two isolates of M. nivale from wheat, one isolate of M. nivale from turfgrass, and one isolate of M. bolleyi. All of these Microdochium genomes were found to share the majority of their predicted gene sets, including the subset of predicted genes that were predicted to play a role in pathogenicity. Despite originating from different continents, the two M. majus genomes examined were found to share over 99% of their predicted genes. A similar trend was observed for the two M. nivale wheat isolates, but, surprisingly, the one M. nivale turf isolate examined was approximately equally dissimilar to the M. nivale wheat and the M. majus genomes. This difference was supported both by the total number of predicted genes shared between the genomes and by phylogenetic trees built from the alignments of genes that were found only among the Microdochium genomes. However, the relative number of matches to genes in a database of pathogen-host interaction genes was found to be nearly identical among all of the M. nivale and M. majus genomes, whereas this similarity was not shared with the predicted genes of M. bolleyi. The whole-genome information prepared in this chapter was used to inform the subsequent research including, most significantly, the search for mating-type genes in these species (Chapter 5).

3.5 References for Chapter 3

Alexopoulos, C.J., Mims, C.W., and Blackwell, M. 1996. Introductory Mycology.

Alkan, C., Sajjadian, S., and Eichler, E.E. 2011. Limitations of next-generation genome sequence assembly. Nature Methods 8(1): 61-65.

Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. 1990. Basic local alignment search tool. Journal of Molecular Biology 215: 403-410.

Amselem, J., Cuomo, C.A., van Kan, J.A.L., Viaud, M., Benito, E.P., Couloux, A., Coutinho, P.M., de Vries, R.P., Dyer, P.S., Fillinger, S., Fournier, E., Gout, L., Hahn, M., Kohn, L., Lapalu, N., Plummer, K.M., Pradier, J.-M., Quevillon, E., Sharon, A., Simon, A., ten Have, A., Tudzynski, B., Tudzynski, P., Wincker, P., Andrew, M., Anthouard, V., Beever, R.E., Beffa, R., Benoit, I., Bouzid, O., Brault, B., Chen, Z., Choquer, M., Collemare, J., Cotton, P., Danchin, E.G., Da Silva, C., Gautier, A., Giraud, C., Giraud, T., Gonzalez, C., Grossetete, S., Guldener, U., Henrissat, B., Howlett, B.J., Kodira, C., Kretschmer, M., Lappartient, A., Leroch, M., Levis, C., Mauceli, E., Neuveglise, C., Oeser, B., Pearson, M., Poulain, J., Poussereau, N., Quesneville, H., Rascle, C., Schumacher, J., Segurens, B., Sexton, A., Silva, E., Sirven, C., Soanes, D.M., Talbot, N.J., Templeton, M., Yandava, C., Yarden, O., Zeng, Q., Rollins, J.A., Lebrun, M.-H., and Dickman, M. 2011. Genomic analysis of the necrotrophic fungal pathogens Sclerotinia sclerotiorum and Botrytis cinerea. PLoS Genetics 7(8): e1002230.

Amyotte, S.G., Tan, X., Pennerman, K., Jimenez-Gasco, M.d.M., Klosterman, S.J., Ma, L.-J., Dobinson, K.F., and Veronese, P. 2012. Transposable elements in phytopathogenic Verticillium spp.: insights into genome evolution and inter- and intra-specific diversification. BMC Genomics 13: 314-333.

Ansorge, W., Sproat, B.S., Stegemann, J., and Schwager, C. 1986. A non-radioactive automated method for DNA sequence determination. Journal of Biochemical and Biophysical Methods 13(6): 315-323.

Attwood, T.K., Gisel, A., Eriksson, N.-E., and Bongcam-Rudloff, E. 2011. Concepts, Historical Milestones and the Central Place of Bioinformatics in Modern Biology: A European Perspective. In Bioinformatics - Trends and Methodologies. Edited by M.A. Mahdavi. InTech.

Borodovsky, M., and Lukashin, A.V. 1998. GeneMark.hmm: new solutions for gene finding. Nucleic Acids Research 26(4): 1107-1115.

Chang, D., and Duda, T.F.J. 2012. Extensive and continuous duplication facilitates rapid evolution and diversification of gene families. Molecular Biology and Evolution 29(8): 2019-2029.

Chen, T.-W., Gan, R.-C.R., Wu, T.H., Huang, P.-J., Lee, C.-Y., Chen, Y.-Y.M., Chen, C.-C., and Tang, P. 2012. FastAnnotator - an efficient transcript annotation web tool. BMC Genomics 13(Suppl 7): S9.

Choi, J., Cheong, K., Jung, K., Jeon, J., Lee, G.-W., Kang, S., Kim, S., Lee, Y.-W., and Lee, Y.-H. 2013. CFGP 2.0: a versatile web-based platform for supporting comparative and evolutionary genomics of fungi and Oomycetes. Nucleic Acids Research 41: D714-D719.

Convert, S.F., Enkerli, J., Miao, V.P., and VanEtten, H.D. 1996. A gene for maackiain detoxification from a dispensable chromosome of Nectria haematococca. Molecular and General Genetics 251(4): 397-406.

Cui, W., Beever, R.E., Parkes, S.L., Weeds, P.L., and Templeton, M.D. 2002. An osmosensing histidine kinase mediates dicarboximide fungicide resistance in Botryotinia fuckeliana (Botrytis cinerea). Fungal Genetics and Biology 36(3): 187-198.

Daboussi, M.-J., and Capy, P. 2003. Transposable elements in filamentous fungi. Annual Review of Microbiology 57: 275-299.

Daboussi, M.-J., Langin, T., and Brygoo, Y. 1992. Fot1, a new family of fungal transposable elements. Molecular and General Genetics 232: 12-16.

Darling, A.C.E., Mau, B., Blattner, F.R., and Perna, N.T. 2004. Mauve: multiple alignment of conserved genomic sequence with rearrangements. Genome Research 14: 1394-1403.

Dean, R.A., Talbot, N.J., Ebbole, D.J., Farman, M.L., Mitchell, T.K., Orbach, M.J., Thon, M., Kulkarni, R., Xu, J.-R., Pan, H., Read, N.D., Lee, Y.-H., Carbone, I., Brown, D., Oh, Y.Y., Donofrio, N., Jeong, J.S., Soanes, D.M., Djonovic, S., Kolomiets, E., Rehmeyer, C., Li, W., Harding, M., Kim, S., Lebrun, M.-H., Bohnert, H., Coughlan, S., Butler, J., Calvo, S., Ma, L.-J., Nicol, R., Purcell, S., Nusbaum, C., Galagan, J.E., and Birren, B.W. 2005. The genome sequence of the rice blast fungus Magnaporthe grisea. Nature 434: 980-986.

Domsch, K.H., and Gams, W. 1972. Fungi in Agricultural Soils. Longman, London.

Eddy, S.R. 2004. What is a hidden Markov model? Nature Biotechnology 22(10): 1315-1316.

Elías-Villalobos, A., Fernández-Álvarez, A., and Ibeas, J.I. 2011. The general transcriptional repressor Tup1 is required for dimorphism and virulence in a fungal plant pathogen. PLoS Pathogens 7(9): e1002235.

Fullwood, M.J., Wei, C.-L., Liu, E.T., and Ruan, Y. 2009. Next-generation DNA sequencing of paired-end tags (PET) for transcriptome and genome analyses. Genome Research 19: 521-532.

Galagan, J.E., Calvo, S.E., Cuomo, C.A., Ma, L.-J., Wortman, J.R., Batzoglou, S., Lee, S.-I., Basturkmen, M., Spevak, C.C., Clutterbuck, J., Kapitonov, V., Jurka, J., Scazzocchio, C., Farman, M., Butler, J., Purcell, S., Harris, S., Braus, G.H., Draht, O., Busch, S., D'Enfert, C., Bouchier, C., Godman, G.H., Bell-Pedersen, D., Griffiths-Jones, S., Doonan, J.H., Yu, J., Vienken, K., Pain, A., Freitag, M., Selker, E.U., Archer, D.B., Penalva, M.A., Oakley, B.R., Momany, M., Tanaka, T., Kumagai, T., Asai, K., Machida, M., Nierman, W.C., Denning, D.W., Caddick, M., Hynes, M., Paoletti, M., Fischer, R., Miller, B., Dyer, P., Sachs, M.S., Osmani, S.A., and Birren, B.W. 2005. Sequencing of Aspergillus nidulans and comparative analysis with A. fumigatus and A. oryzae. Nature 438: 1105-1115.

Gao, Q., Jin, K., Ying, S.-H., Zhang, Y., Xiao, G., Shang, Y., Duan, Z., Hu, X., Xie, X.-Q., Zhou, G., Peng, G., Luo, Z., Huang, W., Wang, B., Fang, W., Wang, S., Zhong, Y., Ma, L.-J., St. Leger, R.J., Zhao, G.-P., Pei, Y., Feng, M.-G., Xia, Y., and Wang, C. 2011. Genome sequencing and comparative transcriptomics of the model entomopathogenic fungi Metarhizium anisopliae and M. acridum. PLoS Genetics 7(1): e1001264.

GenBank. 2012. NCBI-GenBank Flat File Release 193.0 - Distribution Release Notes. National Library of Medicine, Bethesda, MD.

Glynn, N.C., Hare, M.C., Parry, D.W., and Edwards, S.G. 2005. Phylogenetic analysis of EF-1 alpha gene sequences from isolates of Microdochium nivale leads to elevation of varieties majus and nivale to species status. Mycological Research 109: 872-880.


Goffeau, A., Barrell, B.G., Bussey, H., Davis, R.W., Dujon, B., Feldmann, H., Galibert, F., Hoheisel, J.D., Jacq, C., Johnston, M., Louis, E.J., Mewes, H.W., Murakami, Y., Philippsen, P., Tettelin, H., and Oliver, S.G. 1996. Life with 6000 Genes. Science 274(5287): 546-567.
Gregory, T.R., Nicol, J.A., Tamm, H., Kullman, B., Kullman, K., Leitch, I.J., Murray, B.G., Kapraun, D.F., Greilhuber, J., and Bennett, M.D. 2007. Eukaryotic genome size databases. Nucleic Acids Research 35: D332-D338.
Haridas, S., Breuil, C., Bohlmann, J., and Hsiang, T. 2011. A biologist's guide to de novo genome assembly using next-generation sequence data: A test with fungal genomes. Journal of Microbiological Methods 86: 368-375.
Hartl, D.L., and Clark, A.G. 2007. Principles of Population Genetics, Fourth Edition. Sinauer Associates, Sunderland, Massachusetts.
Hatta, R., Ito, K., Hosaki, Y., Tanaka, T., Tanaka, A., Yamamoto, M., Akimitsu, K., and Tsuge, T. 2002. A conditionally dispensable chromosome controls host-specific pathogenicity in the fungal plant pathogen Alternaria alternata. Genetics 161: 59-70.
Hillier, L.W., Marth, G.T., Quinlan, A.R., Dooling, D., Fewell, G., Barnett, D., Fox, P., Glasscock, J.I., Hickenbotham, M., Huang, W., Magrini, V.J., Richt, R.J., Sander, S.N., Stewart, D.A., Stromberg, M., Tsung, E.F., Wylie, T., Schedl, T., Wilson, R.K., and Mardis, E.R. 2008. Whole-genome sequencing and variant discovery in C. elegans. Nature Methods 5: 183-188.
Jackson, A.P., Gamble, J.A., Yeomans, T., Moran, G.P., Saunders, D., Harris, D., Anslett, M., Barrell, J.F., Butler, G., Citiulo, F., Coleman, D.C., de Groot, P.W.J., Goodwin, T.J., Quail, M.A., McQuillan, J., Munro, C.A., Pain, A., Poulter, R.T., Rajandream, M.-A., Renauld, H., Spiering, M.J., Tivey, A., Gow, N.A.R., Barrell, B., Sullivan, D.J., and Berriman, M. 2009. Comparative genomics of the fungal pathogens Candida dubliniensis and Candida albicans. Genome Research 19(12): 2231-2244.
Kingsbury, J.M., Yang, Z., Ganous, T.M., Cox, G.M., and McCusker, J.H. 2004. Novel chimeric spermidine synthase-saccharopine dehydrogenase gene (SPE3-LYS9) in the human pathogen Cryptococcus neoformans. Eukaryotic Cell 3(3): 752-763.
Lee, Y.-J., Yamamoto, K., Hamamoto, H., Nakaune, R., and Hibi, T. 2005. A novel ABC transporter gene ABC2 involved in multidrug susceptibility but not pathogenicity in rice blast fungus, Magnaporthe grisea. Pesticide Biochemistry and Physiology 81(1): 13-23.
Lees, A.K., Nicholson, P., Rezanoor, H.N., and Parry, D.W. 1995. Analysis of variation within Microdochium nivale from wheat - evidence for a distinct subgroup. Mycological Research 99: 103-109.
Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R., and 1000 Genome Project Data Processing Subgroup. 2009a. The sequence alignment / map format and SAMtools. Bioinformatics 25(16): 2078-2079.
Li, R., Fan, W., Tian, G., Zhu, H., He, L., Cai, J., Huang, Q., Cai, Q., Li, B., Bai, Y., Zhang, Z., Zhang, Y., Wang, W., Li, J., Wei, F., Li, H., Jian, M., Li, J., Zhang, Z., Nielsen, R., Li, D., Gu, W., Yang, Z., Xuan, Z., Ryder, O.A., Leung, F.C.-C., Zhou, Y., Cao, J., Sun, X., Fu, Y., Fang, X., Guo, X., Wang, B., Hou, R., Shen, F., Mu, B., Ni, P., Lin, R., Qian, W., Wang, G., Yu, C., Nie, W., Wang, J., Wu, Z., Liang, H., Min, J., Wu, Q., Cheng, S., Ruan, J., Wang, M., Shi, Z., Wen, M., Liu, B., Ren, X., Zheng, H., Dong, D., Cook, K., Shan, G., Zhang, H., Kosiol, C., Xie, X., Lu, Z., Zheng, H., Li, Y., Steiner, C.C., Lam, T.T.-Y., Lin, S., Zhang, Q., Li, G., Tian, J., Gong, T., Liu, H., Zhang, D., Fang, L., Ye, C., Zhang, J., Hu, W., Xu, A., Ren, Y., Zhang, G., Bruford, M.W., Li, Q., Ma, L., Guo, Y., An, N., Hu, Y., Zheng, Y., Shi, Y., Li, Z., Liu, Q., Chen, Y., Zhao, J., Qu, N., Zhao, S., Tian, F., Wang, X., Wang, H., Xu, L., Liu, X., Vinar, T., Wang, Y., Lam, T.-W., Yiu, S.-M., Liu, S., Zhang, H., Li, D., Huang, Y., Wang, X., Yang, G., Jiang, Z., Wang, J., Qin, N., Li, L., Li, J., Bolund, L., Kristiansen, K., Wong, G.K.-S., Olson, M., Zhang, X., Li, S., Yang, H., Wang, J., and Wang, J. 2009b. The sequence and de novo assembly of the giant panda genome. Nature 463(7279): 311-317.
Li, R., Zhu, H., Ruan, J., Qian, W., Fang, X., Shi, Z., Li, Y., Li, S., Shan, G., Kristiansen, K., Li, S., Yang, H., Wang, J., and Wang, J. 2009c. De novo assembly of human genomes with massively parallel short read sequencing. Genome Research 20: 265-272.
Li, R.Q., Li, Y.R., Kristiansen, K., and Wang, J. 2008. SOAP: short oligonucleotide alignment program. Bioinformatics 24(5): 713-714.
Lister, R., Gregory, B.D., and Ecker, J.R. 2009. Next is now: new technologies for sequencing of genomes, transcriptomes, and beyond. Current Opinion in Plant Biology 12: 107-118.
Liu, L., Li, Y., Li, S., Hu, N., He, Y., Pong, R., Lin, D., Lu, L., and Law, M. 2012. Comparison of next-generation sequencing systems. Journal of Biomedicine and Biotechnology 2012: Article ID 251364.
Ma, L.-J., van der Does, H.C., Borkovich, K.A., Coleman, J.J., Daboussi, M.-J., Di Pietro, A., Dufresne, M., Freitag, M., Grabherr, M.G., Henrissat, B., Houterman, P.M., Kang, S., Shim, W.-B., Woloshuk, C., Xie, X., Xu, J.-R., Antoniw, J., Baker, S.E., Bluhm, B.H., Breakspear, A., Brown, D.W., Butchko, R.A.E., Chapman, S., Coulson, R., Coutinho, P.M., Danchin, E.G.J., Diener, A., Gale, L.R., Gardiner, D.M., Goff, S., Hammond-Kosack, K.E., Hilburn, K., Hua-Van, A., Jonkers, W., Kazan, K., Kodira, C.D., Koehrsen, M., Kumar, L., Lee, Y.-H., Li, L., Manners, J.M., Miranda-Saavedra, D., Mukherjee, M., Park, G., Park, J., Park, S.-Y., Proctor, R.H., Regev, A., Ruiz-Roldan, M.C., Sain, D., Sakthikumar, S., Sykes, S., Schwartz, D.C., Turgeon, B.G., Wapinski, I., Yoder, O., Young, S., Zeng, Q., Zhou, S., Galagan, J.E., Cuomo, C.A., Kistler, H.C., and Rep, M. 2010. Comparative genomics reveals mobile pathogenicity chromosomes in Fusarium. Nature 464: 367-373.
Mardis, E.R. 2008. Next-generation DNA sequencing methods. Annual Review of Genomics and Human Genetics 9: 387-402.
Methé, B.A., Nelson, K.E., Deming, J.W., Momen, B., Melamud, E., Zhang, X., Moult, J., Madupu, R., Nelson, W.C., Dodson, R.J., Brinkac, L.M., Daugherty, S.C., Durkin, A.S., DeBoy, R.T., Kolonay, J.F., Sullivan, S.A., Zhou, L., Davidsen, T.M., Wu, M., Huston, A.L., Lewis, M., Weaver, B., Weidman, J.F., Khouri, H., Utterback, T.R., Feldblyum, T.V., and Fraser, C.M. 2005. The psychrophilic lifestyle as revealed by the genome sequence of Colwellia psychrerythraea 34H through genomic and proteomic analyses. Proceedings of the National Academy of Sciences of the U.S.A. 102(31): 10913-10918.
Metzker, M.L. 2010. Sequencing technologies: the next generation. Nature Reviews Genetics 11: 31-36.
Murray, D.I.L., and Gadd, G.M. 1981. Preliminary studies on Microdochium bolleyi with special reference to colonization of barley. Transactions of the British Mycological Society 76(3): 397-403.


Nelson, D.L., and Cox, M.M. 2004. Lehninger Principles of Biochemistry, 4th edition. W.H. Freeman, New York.
Nielsen, R., Paul, J.S., Albrechtsen, A., and Song, Y.S. 2011. Genotype and SNP calling from next-generation sequencing data. Nature Reviews Genetics 12(6): 443-451.
Nowrousian, M., Stajich, J.E., Chu, M., Engh, I., Espagne, E., Halliday, K., Kamerewerd, J., Kempken, F., Knab, B., Kuo, H.-C., Osiewacz, H.D., Poeggeler, S., Read, N.D., Seiler, S., Smith, K.M., Zickler, D., Kück, U., and Freitag, M. 2010. De novo Assembly of a 40 Mb Eukaryotic Genome from Short Sequence Reads: Sordaria macrospora, a Model Organism for Fungal Morphogenesis. PLoS Genetics 6(4): e1000891.
O’Connell, R.J., Thon, M.R., Hacquard, S., Amyotte, S.G., Kleemann, J., Torres, M.F., Damm, U., Buiate, E.A., Epstein, L., Alkan, N., Altmüller, J., Alvarado-Balderrama, L., Bauser, C.A., Becker, C., Birren, B.W., Chen, Z., Choi, J., Crouch, J.A., Duvick, J.P., Farman, M.A., Gan, P., Heiman, D., Henrissat, B., Howard, R.J., Kabbage, M., Koch, C., Kracher, B., Kubo, Y., Law, A.D., Lebrun, M.-H., Lee, Y.-H., Miyara, I., Moore, N., Neumann, U., Nordström, K., Panaccione, D.G., Panstruga, R., Place, M., Proctor, R.H., Prusky, D., Rech, G., Reinhardt, R., Rollins, J.A., Rounsley, S., Schardl, C.L., Schwartz, D.C., Shenoy, N., Shirasu, K., Sikhakolli, U.R., Stüber, K., Sukno, S.A., Sweigard, J.A., Takano, Y., Takahara, H., Trail, F., van der Does, H.C., Voll, L.M., Will, I., Young, S., Zeng, Q., Zhang, J., Zhou, S., Dickman, M.B., Schulze-Lefert, P., van Themaat, E.V.L., Ma, L.-J., and Vaillancourt, L.J. 2012. Lifestyle transitions in plant pathogenic Colletotrichum fungi deciphered by genome and transcriptome analyses. Nature Genetics 44(9): 1060-1067.
Ossowski, S., Schneeberger, K., Clark, R., Lanz, C., Warthmann, N., and Weigel, D. 2008. Sequencing of natural strains of Arabidopsis thaliana with short reads. Genome Research 18: 2024-2033.
Parry, D.W. 1990. The incidence of Fusarium spp. in stem bases of selected crops of winter wheat in The Midlands, UK. Plant Pathology 39(4): 619-622.
Pop, M., and Salzberg, S.L. 2008. Bioinformatics challenges of new sequencing technology. Trends in Genetics 24(3): 142-149.
Punta, M., Coggill, P.C., Eberhardt, R.Y., Mistry, J., Tate, J., Boursnell, C., Pang, N., Forslund, K., Ceric, G., Clements, J., Heger, A., Holm, L., Sonnhammer, E.L.L., Eddy, S.R., Bateman, A., and Finn, R.D. 2012. The Pfam protein families database. Nucleic Acids Research 40(Database issue): D290-D301.
Quail, M.A., Smith, M., Coupland, P., Otto, T.D., Harris, S.R., Connor, T.R., Bertoni, A., Swerdlow, H.P., and Gu, Y. 2012. A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers. BMC Genomics 13: 341-343.
Raffaele, S., and Kamoun, S. 2012. Genome evolution in filamentous plant pathogens: why bigger can be better. Nature Reviews Microbiology 10(6): 417-430.
Roche Diagnostics Corporation 2013. Products - GS-FLX+ system. Available from http://454.com/products/gs-flx-system/ [cited 23 July 2013].
Sanger, F., Nicklen, S., and Coulson, A.R. 1977. DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences of the U.S.A. 74(12): 5463-5467.


Schirawski, J., Mannhaupt, G., Munch, K., Brefort, T., Schipper, K., Doehlemann, G., Di Stasio, M., Rossel, N., Mendoza-Mendoza, A., Pester, D., Muller, O., Winterberg, B., Meyer, E., Ghareeb, H., Wollenberg, T., Munsterkotter, M., Wong, P., Walter, M., Stukenbrock, E., Guldener, U., and Kahmann, R. 2010. Pathogenicity determinants in smut fungi revealed by genome comparison. Science 330: 1546-1548.
Sharpton, T.J., Stajich, J.E., Rounsley, S.D., Gardner, M.J., Wortman, J.R., Jordar, V.S., Maiti, R., Kodira, C., Neafsey, D.E., Zeng, Q., Hung, C.-Y., McMahan, C., Muszewska, A., Grynberg, M., Mandel, M.A., Kellner, E.M., Barker, B.M., Galgiani, J.N., Orbach, M.J., Kirkland, T.N., Cole, G.T., Henn, M.R., Birren, B.W., and Taylor, J.W. 2009. Comparative genomic analyses of the human fungal pathogens Coccidioides and their relatives. Genome Research 19: 1722-1731.
Shendure, J., and Ji, H. 2008. Next-generation DNA sequencing. Nature Biotechnology 26(10): 1135-1145.
Shim, W.B., and Dunkle, L.D. 2003. CZK3, a MAP kinase kinase kinase homolog in Cercospora zeae-maydis, regulates cercosporin biosynthesis, fungal development, and pathogenesis. Molecular Plant Microbe Interactions 16(9): 760-768.
Simpson, J.T., Wong, K., Jackman, S.D., Schein, J.E., Jones, S.J., and Birol, I. 2009. ABySS: A parallel assembler for short read sequence data. Genome Research 19: 1117-1123.
Sleator, R.D. 2010. An overview of the current status of eukaryote gene prediction strategies. Gene 461: 1-4.
Stajich, J.E., Dietrich, F.S., and Roy, S.W. 2006. Comparative genomic analysis of fungal genomes reveals intron-rich ancestors. Genome Biology 8: R223.
Stamp, M. 2012. A revealing introduction to Hidden Markov Models [pdf]. Available from http://www.cs.sjsu.edu/~stamp/RUA/HMM.pdf [cited 4 June 2013].
Stanke, M. 2003. Gene prediction with a hidden Markov model, Mathematics and natural sciences, Georg-August University of Göttingen, Göttingen, Germany.
Stanke, M., Steinkamp, R., Waack, S., and Morgenstern, B. 2004. AUGUSTUS: a web server for gene finding in eukaryotes. Nucleic Acids Research 32(2): W309-W312.
Thon, M., Pan, H., Diener, S., Papalas, J., Taro, A., Mitchell, T., and Dean, R. 2006. The role of transposable element clusters in genome evolution and loss of synteny in the rice blast fungus Magnaporthe oryzae. Genome Biology 7: R16.
Winnenburg, R., Baldwin, T.K., Urban, M., Rawlings, C., Kohler, J., and Hammond-Kosack, K.E. 2006. PHI-base: a new database for pathogen host interactions. Nucleic Acids Research 34: D459-D464.
Zerbino, D.R., and Birney, E. 2008. Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. Genome Research 18: 821-829.


Table 3.1 Summary of DNA quantity and sequencing facility utilized for genome sequencing of six Microdochium isolates.

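Species     Isolate  Quantity of DNA  Insert size  Read length  Sequencing Facility              Location of
            Number   submitted (µg)   (bp)         (bp)                                          Sequencing Facility
M. majus    99049    1                200          75           BC Genomics Sequencing Centre    Vancouver, BC
M. majus    10095    4                400          100          Génome Québec Innovation Centre  Montréal, QC
M. nivale   11037    3                300          100          Génome Québec Innovation Centre  Montréal, QC
M. nivale   12262    2                400          100          Génome Québec Innovation Centre  Montréal, QC
M. nivale   10106    2                400          100          Génome Québec Innovation Centre  Montréal, QC
M. bolleyi  07020    4                400          100          Génome Québec Innovation Centre  Montréal, QC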


Table 3.2 Sordariomycetes genomes included in whole-genome comparisons against Microdochium spp.

Genome Order Family Genus Species Source Plectosphaerellaceae Acremonium alcalophilum JGI* Glomerellales Glomerellaceae Colletotrichum graminearum Hsiang lab

mitosporic dahliae JGI Hypocreales Verticillium Hypocreales albo-atrum Broad Institute Magnaporthe oryzae Broad Institute Sordariaceae Neurospora crassa Broad Institute Chaetomium globosum Broad Institute Sordariales Chaetomiaceae Sporotrichum thermophile JGI Thielavia terrestris JGI Annulohypoxylon stygium Hsiang lab Daldinia eschscholtzii Hsiang lab sp. (CO) JGI Xylariales Xylariaceae Hypoxylon sp. (EC) JGI nigrescens Pestalotiopsis Hsiang lab theae

* JGI = Joint Genome Institute


Table 3.3 Summary of genome assembly, protein prediction, and predicted gene annotation statistics for sequenced Microdochium genomes

Species     Isolate  Number of    Highest      Genome       Coverage  Number of   N50 of      Number of predicted
                     Reads        N50 (bp)     size (bp)              predicted   predicted   genes annotated
                                                                      genes       genes
M. majus    99049    24,527,354   96,968       35,892,675   137       11,343      1,962       7,617
M. majus    10095    77,722,046   260,023      35,930,000   216       11,651      1,986       7,667
M. nivale   11037    52,250,482   139,319      37,010,607   282       12,060      1,956       8,080
M. nivale   12262    56,244,760   309,358      36,683,753   307       11,430      1,989       7,666
M. nivale   10106    44,321,898   371,751      37,085,241   119       11,714      1,992       7,669
M. bolleyi  07020    41,555,928   2,035,661    38,158,599   109       13,047      2,016       8,060


Table 3.4 Genome assembly statistics for M. majus assembly with Velvet, SOAPdenovo, and ABySS for odd-numbered kmers 29-59.

Note that gap-closing was not performed for this comparison.

        SOAPdenovo                      ABySS                           Velvet
kmer    Number of   Genome              Number of   Genome              Number of   Genome
        Scaffolds   size (Mb)  N50      Scaffolds   size (Mb)  N50      Scaffolds   size (Mb)  N50
29      5,113       34.4       2,057    12,004      36.2       54,769   97,954      37.3       1,208
31      2,838       34.6       2,893    10,500      36.2       63,149   67,960      36.7       1,810
33      1,634       34.8       4,218    8,996       36.1       70,824   48,562      36.2       2,681
35      1,119       34.9       6,222    7,948       36.1       79,568   NA*
37      811         35.0       8,650    7,337       36.3       80,957   25,381      35.7       5,259
39      717         35.1       10,413   6,814       36.2       82,418   20,985      35.6       6,356
41      678         35.1       11,454   6,283       36.1       92,244   18,275      35.5       7,260
43      684         35.1       12,113   5,827       36.0       95,775   16,879      35.4       7,684
45      603         35.1       11,756   5,321       35.9       96,968   NA
47      621         35.1       11,208   5,024       36.2       94,931   NA
49      656         35.1       10,238   4,666       36.1       93,566   12,635      35.3       7,977
51      720         35.1       9,262    4,378       35.9       87,205   11,806      35.2       7,794
53      811         35.1       8,133    4,078       35.6       82,032   11,626      35.1       7,441
55      985         35.0       6,948    3,808       35.3       75,085   12,302      34.9       6,438
57      1,282       34.9       5,820    3,536       35.8       69,140   13,636      34.9       5,403
59      1,730       34.8       4,731    3,435       35.3       61,566   16,140      34.6       4,377

* Assembly at this kmer size did not complete
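The N50 values reported in Tables 3.3 and 3.4 summarize how contiguous an assembly is. The sketch below illustrates the conventional definition (the length of the shortest scaffold among the longest scaffolds that together cover at least half of the assembly). It is a minimal illustration only: the function name and the example lengths are hypothetical, and it is assumed, not confirmed by the text, that the assemblers compared here report N50 in this conventional way.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    Conventional definition (an assumption about how the assemblers
    report it): sort scaffolds from longest to shortest and return the
    length at which the running total first reaches half of the
    total assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no scaffold lengths supplied")


# Hypothetical scaffold lengths, not taken from any assembly in this chapter.
example_lengths = [400, 300, 200, 100]
print(n50(example_lengths))  # 300: 400 + 300 = 700 covers half of 1,000
```

A higher N50 with a similar total genome size, as seen across kmer values in Table 3.4, indicates that the same sequence has been joined into fewer, longer scaffolds.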


Table 3.5 Species-specific primers designed and tested with at least two isolates each of M. nivale and M. majus

Ta Predicted band size (bp) Observed band size(s) (bp) Primer Name Primer Sequence % GC (°C) M. nivale M. majus M. nivale M. majus ACCTTGCTGACGGGACC 57.3 65 1,100, Mm.Mn_impB_2340F 1,200 and TCGGCAGCACGAGCCT 962 1115 1,200 and Mm.Mn_impB3440R 55.9 69 2,000 1,800 CCGTGTTGCTCAAGCTG 54.9 59 Mm.Mn_AP2a_3180F 1,000 and 1,000 and 2107 1546 Mm_Mn_AP2a_4720R AGGTCTGTGCCAGTCG 53.5 63 1,500 1,500 Mm.Mn_Med5_3260F AGACGGACCCCTTGCTC 57.3 65 400 and 628 556 500 and 600 Mm.Mn_Med5_3800R AGCAAGTCGGCGCTGC 55.9 69 500 Mm.Mn_GH31_3684F GCTTCCCCGCACTTCCC 59.8 71 700 and 709 800 700 Mm.Mn_GH31_4483R TCCTCAACGTCCCGGCC 59.8 71 800* Mn.Mm_Mn371_3198F ACGTCGCACTGGACCG 55.9 69 603 458 400 450 Mn.Mm_Mn371_3801R TCAACGGCATSRTCGCCG 58.4 61 Mn.Mm_HSP70_1041F CTCAACATCCAGGGTGGTG 59.5 58 332 472 400* 500 Mn.Mm_HSP70_1373R GCGAGTCGATCTCGATGG 58.4 61 Mn.Mm_C6TF_53F TGTGGAACACGATCCTCG 56.3 56 1012 880 - 1,200 Mn.Mm_C6TF_1065sR GTCTCGCCGGACTCGA 55.9 69 * In one isolate only
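The GC percentages listed for the primers in Table 3.5 can be checked directly from the primer sequences. The following is a minimal sketch with a hypothetical function name; it assumes that the IUPAC code S (G or C) counts towards GC content, which reproduces the value listed for the degenerate primer Mn.Mm_Mn371_3801R.

```python
def gc_percent(primer):
    """Return the percentage of G, C, or S (G-or-C) bases in a primer."""
    primer = primer.upper()
    # S is the IUPAC code for "G or C", so it always contributes to GC content.
    gc_count = sum(primer.count(base) for base in "GCS")
    return 100 * gc_count / len(primer)


# Primer sequences copied from Table 3.5.
print(round(gc_percent("ACCTTGCTGACGGGACC")))   # 65 (11 of 17 bases)
print(round(gc_percent("TCGGCAGCACGAGCCT")))    # 69 (11 of 16 bases)
print(round(gc_percent("TCAACGGCATSRTCGCCG")))  # 61 (11 of 18 bases, S counted)
```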


Table 3.6 Comparisons between the predicted gene sequences from M. nivale, M. majus, and M. bolleyi against each other and against the genomes of 6 (non-Microdochium) members of the Xylariales and 9 non-Xylariales members of the Sordariomycetes.

Comparisons were performed using tBLASTn with an e-value cutoff of 1e-05.

Isolate*                              Mn11037  Mn12262  Mn10106  Mm99049  Mm10095  Mb07020
Total number of predicted genes       12,060   11,430   11,714   11,343   11,651   13,047
Total number of genes > 100 bp        11,792   11,206   11,224   11,083   11,159   12,519
Number of genes with no hits          397      10       10       16       9        1,543
Number of genes in all categories     9,551    9,103    9,073    9,110    9,041    9,352
Number of genes not found in the
  NM†-NX‡-Sordariales                 1,378    1,685    1,753    1,583    1,720    220
Number of genes in NM-Xylariales
  not also found in the Sordariales   219      188      184      199      189      9,499
Number of genes in the Sordariales
  not shared with any Microdochium    24       0        1        0        0        13
Total number of genes shared with
specific genomes:
  Mn11037                             NA§      10,639   10,563   10,500   10,478   10,568
  Mn12262                             11,117   NA       11,197   10,815   10,857   10,482
  Mn10106                             11,127   11,193   NA       10,818   10,854   10,481
  Mm99049                             11,045   10,918   10,884   NA       11,143   10,499
  Mm10095                             11,047   10,915   10,880   11,054   NA       10,505
Total number of genes found only in:
  Wheat Mn                            NA       474      560      312      396      NC||
  Turf Mn                             NA       0        4        27       35       NC
  M. bolleyi                          51       1        1        3        1        NA
  All Mn only                         94       161      207      1        1        NC
  All Mm only                         65       1        3        114      140      NC
  Mn and Mm only                      272      669      703      578      717      636

* Mn = M. nivale; Mm = M. majus; Mb = M. bolleyi † NX = Non-xylariales ‡ NS = Non-Sordariomycete § NA = Comparison not applicable || NC = Calculation was not performed
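Tables 3.6 to 3.8 repeat the same comparison at progressively stricter tBLASTn e-value cutoffs (1e-05, 1e-20, and 1e-50), which is why the number of genes with no hits rises from one table to the next. The sketch below shows how such a cutoff could be applied. It assumes BLAST tabular output with the query name in the first column and the e-value in the eleventh; the output format actually used is not stated in this chapter, and the function name and example records are hypothetical.

```python
def genes_with_hits(blast_lines, evalue_cutoff):
    """Return the set of query genes with at least one hit at or below the cutoff.

    Assumes tab-separated BLAST tabular records: query ID in column 1
    and e-value in column 11 (an assumption about the file layout).
    """
    hits = set()
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, evalue = fields[0], float(fields[10])
        if evalue <= evalue_cutoff:
            hits.add(query)
    return hits


# Hypothetical records for three predicted genes.
records = [
    "g1.t1\tscaf1\t90.0\t100\t10\t0\t1\t100\t1\t300\t1e-30\t200",
    "g2.t1\tscaf2\t40.0\t50\t30\t0\t1\t50\t1\t150\t1e-06\t50",
    "g3.t1\tscaf3\t30.0\t40\t28\t0\t1\t40\t1\t120\t0.5\t20",
]
all_genes = {"g1.t1", "g2.t1", "g3.t1"}
for cutoff in (1e-5, 1e-20, 1e-50):
    no_hits = all_genes - genes_with_hits(records, cutoff)
    print(cutoff, len(no_hits))  # 1, then 2, then 3 genes with no hits
```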


Table 3.7 Comparisons between the predicted gene sequences from M. nivale, M. majus, and M. bolleyi against each other and against the genomes of 6 (non-Microdochium) members of the Xylariales and 9 non-Xylariales members of the Sordariomycetes.

Comparisons were performed using tBLASTn with an e-value cutoff of 1e-20.

Isolate*                              Mn11037  Mn12262  Mn10106  Mm99049  Mm10095  Mb07020
Total number of predicted genes       12,060   11,430   11,714   11,343   11,651   13,047
Total number of genes > 100 bp        11,792   11,206   11,224   11,083   11,159   12,519
Number of genes with no hits          896      53       75       51       49       2,116
Number of genes in all categories     8,609    8,252    8,241    8,251    8,246    8,467
Number of genes not found in the
  NM†-NX‡-Sordariales                 1,729    2,422    2,434    2,299    2,401    1,332
Number of genes in NM-Xylariales
  not also found in the Sordariales   332      310      287      298      293      284
Number of genes in the Sordariales
  not shared with any Microdochium    27       0        3        0        3        8,660
Total number of genes shared with
specific genomes:
  Mn11037                             NA§      10,097   9,976    9,991    9,904    9,932
  Mn12262                             10,503   NA       11,116   10,393   10,346   9,847
  Mn10106                             10,528   11,141   NA       10,404   10,356   9,848
  Mm99049                             10,454   10,438   10,333   NA       11,086   9,873
  Mm10095                             10,463   10,450   10,339   11,013   NA       9,885
Total number of genes found only in:
  Wheat Mn                            NA       944      1,049    442      488      NC||
  Turf Mn                             NA       2        6        50       58       NC
  M. bolleyi                          86       1        1        4        1        NA
  All Mn only                         93       533      619      2        2        NC
  All Mm only                         65       2        2        405      532      NC
  Mn and Mm only                      224      691      670      679      701      806

* Mn = M. nivale; Mm = M. majus; Mb = M. bolleyi † NX = Non-xylariales ‡ NS = Non-Sordariomycete § NA = Comparison not applicable || NC = Calculation was not performed


Table 3.8 Comparisons between the predicted gene sequences from M. nivale, M. majus, and M. bolleyi against each other and against the genomes of 6 (non-Microdochium) members of the Xylariales and 9 non-Xylariales members of the Sordariomycetes.

Comparisons were performed using tBLASTn with an e-value cutoff of 1e-50.

Isolate*                              Mn11037  Mn12262  Mn10106  Mm99049  Mm10095  Mb07020
Total number of predicted genes       12,060   11,430   11,714   11,343   11,651   13,047
Total number of genes > 100 bp        11,792   11,206   11,224   11,083   11,159   12,519
Number of genes with no hits          1,785    487      641      494      632      3,082
Number of genes in all categories     6,694    6,531    6,495    6,503    6,533    6,652
Number of genes not found in the
  NM†-NX‡-Sordariales                 2,545    3,529    3,403    3,445    3,329    1,983
Number of genes in NM-Xylariales
  not also found in the Sordariales   516      477      462      479      463      482
Number of genes in the Sordariales
  not shared with any Microdochium    30       8        12       5        9        6,917
Total number of genes shared with
specific genomes:
  Mn11037                             NA§      9,126    9,054    9,039    8,982    8,829
  Mn12262                             9,464    NA       10,463   9,372    9,288    8,715
  Mn10106                             9,487    10,622   NA       9,378    9,298    8,711
  Mm99049                             9,418    9,354    9,244    NA       10,411   8,718
  Mm10095                             9,436    9,363    9,251    10,511   NA       8,735
Total number of genes found only in:
  Wheat Mn                            NA       1,405    1,340    446      46       NC||
  Turf Mn                             NA       21       32       102      113      NC
  M. bolleyi                          98       9        6        5        7        NA
  All Mn only                         120      1,064    1,045    22       38       NC
  All Mm only                         84       9        5        858      880      NC
  Mn and Mm only                      323      627      588      714      645      1,197

* Mn = M. nivale; Mm = M. majus; Mb = M. bolleyi † NX = Non-xylariales ‡ NS = Non-Sordariomycete § NA = Comparison not applicable || NC = Calculation was not performed


Table 3.9 Fungal pathogen-host interaction (PHI) genes with highly variable copy numbers among Microdochium predicted gene sets that may play a role in pathogenicity or fungicide resistance (Mb = Microdochium bolleyi; Mm = M. majus; Mn = M. nivale)

Accession   Function                        Function Reference              Number of Matches
Number                                                                      Mb      Mm      Mm      Mn      Mn      Mn
                                                                            07020   10095   99049   10106   12262   11037
AAC49410    Maackiain detoxification        (Convert et al. 1996)           0       35      32      38      30      36
AAL37947    Osmosensing histidine kinase    (Cui et al. 2002)               4       16      14      15      14      8
MGG_06847   Transcriptional repressor TUP1  (Elías-Villalobos et al. 2011)  5       16      14      15      14      8
BAC67162    ATP transporter                 (Lee et al. 2005)               15      2       2       2       2       8
AAP72037    MAP kinase kinase kinase CZK3   (Shim and Dunkle 2003)          7       18      18      18      14      14
AAS48112    Chimeric spermidine synthase-   (Kingsbury et al. 2004)         2       20      16      16      16      16
            saccharopine dehydrogenase


Table 3.10 Accession numbers for ten randomly-selected putative unique predicted gene sequences identified in Microdochium spp.

M. majus    M. majus    M. nivale   M. nivale   M. nivale   M. bolleyi
99049       10095       10106       11037       12262       07020
g1826.t1    g9539.t1    g4451.t1    g11430.t1   g9867.t1    g4309.t1
g6500.t1    g3237.t1    g1418.t1    g2957.t1    g11414.t1   g3190.t1
g10980.t1   g8406.t1    g428.t1     g4027.t1    g1299.t1    g9001.t1
g6974.t1    g1617.t1    g2727.t1    g6734.t1    g1084.t1    g7673.t1
g4340.t1    g8457.t1    g6286.t1    g3740.t1    g3117.t1    g5294.t1
g6884.t1    g9941.t1    g5645.t1    g7583.t1    g6019.t1    g3113.t1
g4702.t1    g8609.t1    g6364.t1    g2240.t1    g8787.t1    g4479.t1
g9196.t1    g8787.t1    g1372.t1    g9519.t1    g730.t1     g3953.t1
g10976.t1   g8461.t1    g10074.t1   g809.t1     g2168.t1    g6740.t1
g7596.t1    g2716.t1    g3741.t1    g5819.t1    g10432.t1   g2416.t1


Table 3.11 Transposable element sequences downloaded from GenBank that were used in comparisons against the genomes of Microdochium spp.

Transposon Accession Sequence Class Superfamily* Family* Organism Function Name Number Type AAA88790.1 gag aa skippy Fusarium oxysporum S60179 pol aa AAA33420.1 RT† aa MAGGY Magnaporthe grisea AAA33419.1 gag aa LTR Ty3/gypsy |T18348 pol nt retrotransposons gag and aa pyret M. grisea AB062507.1 pol Colletotrichum. AAG24792.1 pol aa Cgret I gloeosporoides AAG24791.1 gag aa CAA67543.1 RT aa Copia/Ty1- Mars2,3 Ascobolus immersus CAA67545.1 integrase aa like Nht2 Nectria haematococca AY038360.1 RT nt Non-LTR LINES MGL M. grisea AF018033.1 RT nt retrotransposon SINES Foxy F. oxysporum AJ250814.1 insertion nt s repet. nt Unrelated Marsu F. oxysporum AF076630.1 element. Fot1 F. oxysporum EMT73539.1 unknown aa flipper Botrytis cinerea AAB63315.1 transposase aa Pogo TC1 / mariner Pat P. anserina unknown aa II crawler Aspergillus oryzae BAE93244.1 transposase aa Ant1 and aa impala F. oxysporum AAB33090.2 transposase Tc1 hAT-like -§ restless Tolypocladium inflatum CAA93759. transposase aa

Transposon Accession Sequence Class Superfamily* Family* Organism Function Name Number Type Mutator-like - mutator Neofusicoccum parvum EOD48650.1 mutator aa Guest N. crassa AY374119.1 unknown nt MITE-like - nt mimp F. oxysporum EU833101.1 mimp4 Activator - VdHAT Verticillium dahliae JN160811.1 unknown nt

* From (Daboussi and Capy 2003) § - no family assigned † - RT = reverse transcriptase


Table 3.12 Summary of putative transposable element sequences identified in the Microdochium spp. genomes and their relative proximity to putative pathogen-host interaction genes

Species     Isolate  Total number of   Number of TEs found within
                     putative TEs*     5 kb of a PHI gene
M. nivale   11037    328               198
M. nivale   12262    230               153
M. nivale   10106    196               121
M. majus    99049    240               141
M. majus    10095    276               215
M. bolleyi  07020    317               162

* Putative TEs were counted as unique only if they were not found within 500 bp of another match, to avoid counting the same element twice.
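The two counts in Table 3.12 depend on two distance rules: matches within 500 bp of another match are treated as the same element, and a TE is considered close to a PHI gene when it lies within 5 kb of one. The sketch below shows one way these rules could be applied to matches on a single scaffold. It is an illustration under stated assumptions rather than the procedure used in the thesis: positions are represented as single coordinates, each match is compared with the most recently kept match, and all names and coordinates are hypothetical.

```python
def collapse_matches(positions, min_gap=500):
    """Keep a match only if it lies more than min_gap bp beyond the last kept match."""
    kept = []
    for pos in sorted(positions):
        if not kept or pos - kept[-1] > min_gap:
            kept.append(pos)
    return kept


def count_near_genes(te_positions, gene_positions, window=5000):
    """Count TEs lying within `window` bp of at least one gene position."""
    return sum(
        1
        for te in te_positions
        if any(abs(te - gene) <= window for gene in gene_positions)
    )


# Hypothetical coordinates on one scaffold.
raw_matches = [1000, 1200, 1400, 9000, 20000]
phi_genes = [3000, 30000]

unique_tes = collapse_matches(raw_matches)
print(unique_tes)                               # [1000, 9000, 20000]
print(count_near_genes(unique_tes, phi_genes))  # 1
```

In a whole-genome analysis the same steps would be run separately for each scaffold, since distances between positions on different scaffolds are not meaningful.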



Figure 3.1 Pipeline for DNA sequencing by Illumina-Solexa technology. Genomic DNA is isolated (1) and sheared into millions of fragments (2). Adapter sequences are ligated onto both ends of all fragments (3). The fragments are introduced to the solid surface (4), which carries sequences that are complementary to the adapters. In the PCR stage (5), the second adapter sequence of each fragment (white) anneals to its complement. The fragment is amplified by DNA polymerase to produce a complementary strand (6). In each round of PCR, both strands are duplicated (7). The result is clusters of identical complementary sequences (8) for each of the original fragments. In the sequencing stage, a solution containing all four dNTPs, each labelled with a different terminating fluorophore, is introduced (9). Only one nucleotide at a time can be incorporated (10) because of the terminator. The incorporated nucleotide is identified by its fluorophore. The fluorophore terminator is cleaved (11) to allow for the addition of the next nucleotide. The labelled dNTP solution is re-applied (12), the second nucleotide is incorporated (13), and step 11 is repeated. This process is repeated until the full fragment has been sequenced (14) (Mardis 2008).


[Figure 3.2 diagram: a 500 bp fragment of sheared genomic DNA, drawn 5’ to 3’, with paired reads A and B of 100 bp each at its two ends.]

Figure 3.2 Summary of paired-end sequencing. Genomic DNA is sheared into fragments of approximately the same length (e.g. approximately 500 bp). Each fragment is then sequenced from both the 5' and the 3' ends, generating two "paired" reads (labelled A and B and joined by a dashed arc). The number of sequencing cycles performed is equal to the combined length of these reads (e.g. for fragments yielding two paired-end reads of 100 bp each, a total of 200 cycles would be performed). During the assembly process, the expected distance between the two reads (in this example, 300 bp) is used to facilitate their association with the other reads generated during sequencing.
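The arithmetic in the Figure 3.2 example can be written out as a short calculation. This is a minimal sketch with hypothetical names; it assumes that both reads of a pair have the same length and that they do not overlap within the fragment.

```python
def paired_end_summary(fragment_len, read_len):
    """Return the sequencing cycles and the unsequenced gap for one read pair.

    Assumes two reads of equal length taken from opposite ends of a
    fragment that is at least as long as both reads combined.
    """
    if 2 * read_len > fragment_len:
        raise ValueError("reads would overlap within the fragment")
    return {
        "cycles": 2 * read_len,              # one cycle per sequenced base
        "gap": fragment_len - 2 * read_len,  # expected distance between the reads
    }


# Values from the Figure 3.2 example: a 500 bp fragment and 100 bp reads.
print(paired_end_summary(500, 100))  # {'cycles': 200, 'gap': 300}
```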


[Figure 3.3 diagram: nodes A, B, C, and D, each a stack of short overlapping sequences (e.g. GGTCAT, GTCATC, TCATCG), drawn with their sister nodes A', B', C', and D', which hold the reverse complements.]

Figure 3.3 Schematic representation of a de Bruijn graph (Zerbino and Birney 2008). Nodes are represented by boxes. Each node consists of a short alignment of overlapping sequences of the same length. Each node also has a sister node consisting of the reverse complement of the sequences and the alignment in its sister, located immediately above or below the node (e.g. A and A' are sister nodes). Nodes are connected based on their sequence similarities: for example, the final sequence in node A shares four of its six nucleotides with the first sequence in node B. Node B is connected both to node A and to nodes C and D, because the final sequence in node B overlaps equally well with the first sequences in both nodes C and D. The use of de Bruijn graphs results in the association of short, overlapping alignments that are used to assemble the sequencing reads into larger contigs or scaffolds.
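The idea behind Figure 3.3 can be illustrated with a much simplified graph in which every k-mer is its own node and consecutive k-mers from the same read are linked. This is a sketch for illustration only, not the Velvet implementation: in Velvet a node holds a run of overlapping k-mers together with its reverse-complement sister, and the function names below are hypothetical. The example read is built from three sequences that appear in the figure.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def kmers(read, k):
    """Return all overlapping substrings of length k, in read order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def de_bruijn_edges(reads, k):
    """Link each k-mer to the k-mer that follows it in a read (overlap of k - 1)."""
    edges = set()
    for read in reads:
        words = kmers(read, k)
        for left, right in zip(words, words[1:]):
            edges.add((left, right))
    return edges


read = "GGTCATCG"  # spans the k-mers GGTCAT, GTCATC, and TCATCG
print(kmers(read, 6))                # ['GGTCAT', 'GTCATC', 'TCATCG']
print(reverse_complement("GGTCAT"))  # ATGACC, the entry in the sister node
print(sorted(de_bruijn_edges([read], 6)))
```

Following the edges from k-mer to k-mer reconstructs the read, and reads that share k-mers become joined through the shared nodes, which is how overlapping reads are merged into contigs.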


[Figure 3.4 trees: panels a), b), and c), each with the tips M. bolleyi 07020, M. nivale 11037, M. nivale 10106, M. nivale 12262, M. majus 99049, and M. majus 10095; bootstrap values are shown at the nodes, and panels a) and b) carry a scale bar of 0.1.]

Figure 3.4 a) Neighbour-joining, b) maximum likelihood, and c) maximum parsimony trees depicting the relationships between the sequenced Microdochium genomes, based on the concatenated sequences of ten genes that were putatively unique to Microdochium. Bootstrap values (out of 100) are displayed on each node. The ten genes are listed in Table 3.10. Scale bars represent either 0.1 nucleotide change per base (a and b) or 100 substitutions (c).

Appendices for Chapter 3

Appendix 3.1 Sample script to execute SOAPdenovo

#!/bin/bash
# ./0run_soap.sh 1>soap_25-63.out 2>soap_25-63.errors &

PATH=$PATH:/home/thsiang/programs/soap

# Assemble with every odd kmer size from 25 to 63
for ((i=25; i<64; i+=2))
do
    mkdir kmer$i
    soap_105_63k all -p 2 -s config.txt -K $i -o ./kmer$i/mm10095_kmer$i
done

# Close gaps in the scaffolds produced for each kmer size
for ((i=25; i<64; i+=2))
do
    GapCloser -b config.txt -a ./kmer$i/mm10095_kmer$i.scafSeq -o ./kmer$i/mm10095_gclose$i.nt
done


Appendix 3.2 Sample configuration file for SOAPdenovo

#maximal read length
max_rd_len=100
[LIB]
#average insert size
avg_ins=400
#if sequence needs to be reversed
reverse_seq=0
#in which part(s) the reads are used
asm_flags=3
#use only first 100 bps of each read
rd_len_cutoff=100
#in which order the reads are used while scaffolding
rank=1
# cutoff of pair number for a reliable connection (at least 3 for short insert size)
pair_num_cutoff=3
#minimum aligned length to contigs for a reliable read location (at least 32 for short insert size)
map_len=32
#a pair of fastq file, read 1 file should always be followed by read 2 file
q1=/home/thsiang/dna_data/1304_genomes/microdochium_majus10095_R1.fastq.gz
q2=/home/thsiang/dna_data/1304_genomes/microdochium_majus10095_R2.fastq.gz
#a pair of fasta file, read 1 file should always be followed by read 2 file
#f1=/path/**LIBNAMEA**/fasta1_read_1.fa
#f2=/path/**LIBNAMEA**/fasta1_read_2.fa
#fastq file for single reads
#q=/path/**LIBNAMEA**/fastq1_read_single.fq
#fasta file for single reads
#f=/path/**LIBNAMEA**/fasta1_read_single.fa
#a single fasta file for paired reads
#p=/path/**LIBNAMEA**/pairs1_in_one_file.fa
#bam file for single or paired reads, reads 1 in paired reads file should always be followed by reads 2
# NOTE: If a read in bam file fails platform/vendor quality checks(the flag field 0x0200 is set), itself and it's paired read would be ignored.
#b=/path/**LIBNAMEA**/reads1_in_file.bam


Appendix 3.3 Sample script used to execute ABySS

#!/bin/bash
# ./0abyss.sh -p 2 2>abyss25-63_new.err 1>abyss25-63_new.out &

PATH=$PATH:/home/ljewell/programs/abyss-1.3.5/bin

# Assemble with every odd kmer size from 25 to 63
for ((i=25; i<64; i+=2))
do
    mkdir kmer$i
    abyss-pe k=$i n=5 name=mb07020_kmer$i in=' /home/thsiang/dna_data/1304_genomes/microdochium_bolleyi07020_R1.fastq.gz /home/thsiang/dna_data/1304_genomes/microdochium_bolleyi07020_R2.fastq.gz'
done

Appendix 3.4 Sample script used to execute Velvet

# sqsub --mpp=16g -o 99049_Velvet57-65.out -r 2d bash run_Velvet.sh

PATH=$PATH:/work/ljewell/hound/programs/Velvet/
for ((i=57; i<65; i+=2))
do
    mkdir k$i
    velveth ./k$i $i -shortPaired ../majus_paired.fas
    velvetg ./k$i -ins_length 200 -exp_cov 20
done

Appendix 3.5 Sample script used to execute AUGUSTUS

#!/bin/bash
#./0run_agustus.sh 1>aug.out 2>aug.err &

/home/ljewell/programs/augustus/bin/augustus --AUGUSTUS_CONFIG_PATH=/home/ljewell/programs/augustus/config/ --species=magnaporthe_grisea ../abyss/kmer63/mm10095_kmer63-scaffolds.fa > scaff_pred_mm10095_a63

perl /home/ljewell/programs/augustus/scripts/getAnnoFasta.pl scaff_pred_mm10095_a63 --seqfile=../abyss/kmer63/mm10095_kmer63-scaffolds.fa

Appendix 3.6 Annotate_genes.pl

#!/usr/bin/perl
#Script to parse through FastAnnotate results to associate them with full names from the Pfam table

#script use: ./annotate_genes.pl fa_output pfam_table pred_genes genome
#Where:
#fa_output is the output of fastannotate
#pfam_table is the downloaded table of all pfam accessions
#pred_genes is the file that was used for the fastannotate submission
#and "genome" is the desired genome name (eg isolate number) to be appended to the name of the gene
open $IN, "$ARGV[0]";
open $PFAM, "$ARGV[1]";
open $GENOME, "$ARGV[2]";
open $OUT, "+>$ARGV[2].annotated";
$genome_name = $ARGV[3];

%genome = ();
%pfam = ();
while (<$GENOME>) {
    chomp $_;
    if ($_ =~ />/) {
        $name = $_;
        $name =~ s/>//g;
        $seq = <$GENOME>;
        chomp $seq;
        $genome{$name} = $seq;
    }
}
while (<$PFAM>) {
    chomp $_;
    @line = split(/\t/, $_);
    $acc = $line[4];
    $acc =~ /PF(\d+)\.\d+/;
    $acc = $1;
    $pfam{$acc} = $line[6];
}
while (<$IN>) {
    chomp $_;
    @line = split(/\t/, $_);
    $gene = $line[0];
    $pfam_acc = $line[10];

    if ($pfam_acc =~ /-/) {
        print $OUT ">$genome_name|$gene| Unknown function\n$genome{$gene}\n";
    }
    else {
        $pfam_acc =~ /pfam(\d+) /;
        $pfam_acc = $1;

        if (exists($pfam{$pfam_acc})) {
            $desc = $pfam{$pfam_acc};
            $desc =~ s/\"//g;
            print $OUT ">$genome_name|$gene| $desc\n$genome{$gene}\n";
        }
        else {
            print $OUT ">$genome_name|$gene| Unknown function\n$genome{$gene}\n";
        }
    }
}

close $IN;
close $PFAM;
close $OUT;
close $GENOME;
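
The accession matching used in annotate_genes.pl can be illustrated with a short sketch. The script itself is Perl; the Python below is only an illustration of the two regular expressions, and the function names and example strings are hypothetical rather than taken from the FastAnnotate or Pfam files used in this work.

```python
import re

def pfam_digits_from_table(accession):
    """Digits of a versioned accession such as 'PF00069.20' (example value is hypothetical)."""
    match = re.search(r"PF(\d+)\.\d+", accession)
    return match.group(1) if match else None

def pfam_digits_from_hit(field):
    """Digits of a 'pfamNNNNN ' token in an annotation field (field layout is assumed)."""
    match = re.search(r"pfam(\d+) ", field)
    return match.group(1) if match else None

# Hypothetical one-entry lookup table keyed by the accession digits
table = {pfam_digits_from_table("PF00069.20"): "Protein kinase domain"}
description = table.get(pfam_digits_from_hit("pfam00069 Pkinase"), "Unknown function")
print(description)
```

Keying both sides on the bare digits is what allows a versioned table accession and an unversioned hit label to be matched to one another.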

Appendix 3.7 parse_m9.pl

#!/usr/bin/perl
use warnings;
#parse_m9.pl

#Script to parse formatting from BLAST searches conducted with -m9 flag
#Note that this script accepts a LIST OF FILES as its input
#NOT BLAST outfiles directly
#Script use: ./parse_m9.pl list_of_BLAST_outfiles
#Output is BLAST_outfile.parsed
#(where BLAST_outfile is the name of your raw BLAST output)
open $LIST, "<$ARGV[0]";
@list = <$LIST>;
for ($i=0; $i<=$#list; $i++) {
    chomp $list[$i];
    open $IN, "<$list[$i]";
    open $OUT, "+>$list[$i].parsed";

    while (<$IN>) {
        if ($_ =~ /\#/){
        }
        else {
            print $OUT "$_";
        }
    }
    close $OUT;
    close $IN;
}
close $LIST;

Appendix 3.8 make_simple_table_v2.pl

#!/usr/bin/perl
#13.01.01
use warnings;
# make_simple_table_v2.pl

#Script use:
# ./make_simple_table_v2.pl list names max_eval

#NOTE: max_eval is the MAX exponent allowed, e.g. for e-05 :
# ./make_simple_table.pl list names 05
#"list" is a file that contains the filenames of the parsed BLAST results
#to make this list: ls *.parsed > list [for example]
#"names" is a file that contains the NAMES ONLY of all predicted genes

# This script will parse multiple BLAST output files and will create a master table in the following format:

#              species 1     species 2 . . .
# gene name    hit_exists    hit_exists.....
# gene 1       0             1
# gene 2       1             0
# . . .

####NOTE####
# This script should be run IN THE FOLDER THAT CONTAINS THE BLAST OUTPUT FILES!!
# Blast should be run using the -m 9 flag (outputs results in tabular form) and parsed to remove the comments (script parse_m9.pl)

########################################################################################

#Open the list file:

$listname = $ARGV[0];
chomp $listname;

#Throw the filenames into an array:
open $LIST, "<$listname";
@list = <$LIST>;
close $LIST;

#grab the max evalue from the input line:
$max_eval = $ARGV[2];

#open the list that contains the names of the genes:
open $NAMES, "$ARGV[1]";
@gene_names = <$NAMES>;
close $NAMES;
chomp @gene_names;

#open an outfile and tag it with the evalue for later reference:
open $OUT, "+>simple_table.$max_eval.txt";

#Print the table headers into the outfile:
print $OUT "Gene_name\t";

#Parse and then print the names of the species that the BLAST was performed against:
foreach $species (@list) {
    chomp $species;
    $temp_name = $species;
    $temp_name =~ s/\.parsed//g;
    print $OUT "$temp_name\t";
}
print $OUT "\n\t";

#Print some human-friendly headers:
$num_list = 0;
until ($num_list > $#list) {
    print $OUT "hit_exists\t";
    $num_list ++;
}
print $OUT "\n";

#Create a hash that will ultimately become the hash-of-hashes that stores all of the data:
%hash = ();

#For each species in turn, open the corresponding BLAST output file:
foreach $sp_name(@list) {
    chomp $sp_name;
    open ($BLAST, "$sp_name");

    #Go line by line through the BLAST output:
    foreach $entry (<$BLAST>) {
        chomp $entry;

        #Split each line. Grab the name (the 0th entry) and the evalue (the 10th entry):
        @line = split(/\s+/, $entry);
        $genename = $line[0];
        $eval = $line[10];

        #If the evalue is EITHER 0.0 OR if its exponent is greater than or equal to the max exponent (exp >= max_eval)
        #Store the name of the gene in the hash for the species, which in turn is stored in the global hash:
        #If this requirement is not met, do nothing.
        #Note that this means that you only need ONE hit that meets the threshold value to get a "positive" result in the final table
        if ($eval =~ /0.0/) {
            $hash{$sp_name}{$genename} = $eval;
        }

        elsif ($eval =~ /e/) {
            $eval =~ s/\d+e-//g;
            if ($eval >= $max_eval) {
                $hash{$sp_name}{$genename} = $eval;
            }
        }
    }
    close $BLAST;
}

#When all of the data has been stored, parse through it and output the results into a table for further analysis:

#For all of the genes:
foreach $name(@gene_names) {
    #Create a row in the table that starts with the name of the gene:
    print $OUT "$name\t";
    #Now go through each species in the hash of hashes and look for an entry corresponding to the gene of interest:
    foreach $species (@list) {
        #If the entry exists, print a value of 1
        if ($hash{$species}{$name}) {
            print $OUT "1\t";
        }
        #If the entry does not exist, print a value of 0
        else {
            print $OUT "0\t";
        }
    }
    print $OUT "\n";
}
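
The exponent cut-off and the resulting binary hit table can be sketched as follows. This is a Python illustration rather than the Perl script above, and it simplifies the matching: it assumes that only the literal string 0.0 counts as a zero e-value and that all other e-values are printed with an `e-NN` suffix. The function names, file names and e-values are hypothetical.

```python
import re

def passes_threshold(evalue, max_exp):
    """True if the e-value string is 0.0 or its negative exponent is at least max_exp."""
    if evalue == "0.0":
        return True
    match = re.search(r"e-(\d+)$", evalue)
    return match is not None and int(match.group(1)) >= max_exp

def hit_row(gene, species_hits, species_order):
    """One table row: 1 if the gene has a qualifying hit in a species, else 0."""
    return [1 if gene in species_hits[sp] else 0 for sp in species_order]

# Hypothetical parsed BLAST rows: (species file, gene name, e-value)
rows = [("sp1.parsed", "g1", "3e-12"),
        ("sp1.parsed", "g2", "2e-03"),
        ("sp2.parsed", "g2", "0.0")]
species_order = ["sp1.parsed", "sp2.parsed"]
species_hits = {sp: set() for sp in species_order}
for sp, gene, evalue in rows:
    if passes_threshold(evalue, 5):  # a single qualifying hit is sufficient
        species_hits[sp].add(gene)

for gene in ["g1", "g2"]:
    print(gene, *hit_row(gene, species_hits, species_order), sep="\t")
```

With a cut-off of e-05, gene g1 is scored as present only in the first species and gene g2 only in the second, because the 2e-03 hit does not meet the threshold.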

Appendix 3.9 summarize_with_files_v2.pl

#!/usr/bin/perl
#13.05.06
use warnings;

#This is an updated version of my old "summarize_with_files" script
#It takes your binary hit matrix AS WELL AS the species and a host

# Script use: ./summarize_with_files_v2.pl table.txt species host
#e.g. ./summarize_with_files_v2.pl simple_table.txt nivale wheat
#NOTE: host is optional for majus

#Note that this script runs, at its core, as a series of subroutines
#These subroutines are explained more thoroughly within their guts:

$my_species = $ARGV[1];
if ($my_species =~ /nivale/) {
    $host = $ARGV[2];
    if ($host) {
    }
    else {
        die "ERROR: Please enter a host name for M. nivale (wheat or grass)\n";
    }
}
elsif ($my_species =~ /majus/) {
    $host = "wheat";
}

&determine_relationships();
&open_files();
&define_variables();

$gene_counter = 0;
while (<$IN>) {
    chomp $_;
    # Check the current line. If it is the first line, it contains a list of the names which must be parsed to determine which data is where
    # Split the line into an array and examine the second column. If it is the first line, the second column will contain non-digit characters (i.e. \D).
    # If it is a data line, it will contain either a 0 or 1 (\d)

    @temp_array = split (/\s+/, $_);

    if ($temp_array[1] =~ /hit/) {
    }

    elsif ($temp_array[1] =~ /\D/) {
        &identify_columns();
    }

    else {
        $gene_counter++;
        &determine_relationships();
        &do_math();
        &parse_results();
    }
}

&print_outfiles();
&close_files();

#############
#SUBROUTINES#
#############
sub do_math {

    $num_genes ++;
    $total = 0;
    $gene_name = $temp_array[0];

    #Tally the results (i.e. the 1's and 0's) to determine the total number of times that a particular gene was found:

    for ($z = 1; $z <= $#temp_array; $z ++) {
        $total += $temp_array[$z];
    }

    #Immediately check if the sum is zero. If it is, there is no need to perform any of the other checks:
    if ($total == 0) {
        $never++;
        print $NEVER "$gene_name\n";
        next;
    }

    #Determine the "exclusivity" of the hit:
    $sister_sum = 0;
    foreach $sis (@sister_locations) {
        $sister_sum+=$temp_array[$sis];
    }

    $intra_sum = 0;
    foreach $intra (@intra_locations) {
        $intra_sum += $temp_array[$intra];
    }

    $wheat_sum = 0;
    foreach $wheat (@wheat_locations) {
        $wheat_sum += $temp_array[$wheat];
    }

    $grass_sum = 0;
    foreach $grass (@grass_locations) {
        $grass_sum += $temp_array[$grass];
    }

    $mb_sum = 0;
    foreach $mb (@mb_locations) {
        $mb_sum += $temp_array[$mb];
    }

    $all_mic_sum = $sister_sum + $intra_sum + $mb_sum;

    $other_sum = 0;
    foreach $sord (@other_locations) {
        $other_sum += $temp_array[$sord];
    }
    $maj_sum = 0;
    foreach $maj (@majus_locations) {
        $maj_sum += $temp_array[$maj];
    }
    $niv_sum = 0;
    foreach $niv (@nivale_locations) {
        $niv_sum += $temp_array[$niv];
    }
    $xyl_sum = 0;
    foreach $xyl (@xyl_locations) {
        $xyl_sum += $temp_array[$xyl];
    }

    $mm99049_sum = 0;
    foreach $mm99049 (@mm99049_locations) {
        $mm99049_sum += $temp_array[$mm99049];
    }
    $mm10095_sum = 0;
    foreach $mm10095 (@mm10095_locations) {
        $mm10095_sum += $temp_array[$mm10095];
    }
    $mn12262_sum = 0;
    foreach $mn12262 (@mn12262_locations) {
        $mn12262_sum += $temp_array[$mn12262];
    }
    $mn10106_sum = 0;
    foreach $mn10106 (@mn10106_locations) {
        $mn10106_sum += $temp_array[$mn10106];
    }
    $mn11037_sum = 0;
    foreach $mn11037 (@mn11037_locations) {
        $mn11037_sum += $temp_array[$mn11037];
    }
}

###################################################################
sub parse_results {

    #First some general stuff:
    if ($mm99049_sum != 0) {
        $total_in_99049++;
    }
    if ($mm10095_sum != 0) {
        $total_in_10095++;
    }
    if ($mn12262_sum != 0) {
        $total_in_12262++;
    }
    if ($mn11037_sum != 0) {
        $total_in_11037++;
    }
    if ($mn10106_sum != 0) {
        $total_in_10106++;
    }

    if (($my_species =~ /nivale/) || ($my_species =~ /majus/)){
        if ($intra_sum != 0) {
            $total_in_intra++;
        }
        if ($sister_sum != 0) {
            $total_in_sister++;
        }
        if ($mb_sum != 0) {
            $total_in_mb++;
        }
    }

    elsif (($my_species =~ /bolleyi/)){
        if ($niv_sum != 0) {
            $total_in_niv++;
        }
        if ($maj_sum != 0) {
            $total_in_maj++;
        }
    }

    if ($other_sum != 0) {
        $total_in_sord++;
    }
    if ($xyl_sum != 0) {
        $total_in_xyl++;
    }

    #If there are no "other" hits (i.e. no hits in a non-xylariales...)
    if ($other_sum == 0) {
        #If there are no hits in the NON-MIC xylariales, this means that the only hits are among the mic species
        $not_in_sord++;

        ########################################################################################

        if ($xyl_sum != 0) {
            $in_any_xyl++;
            if (($niv_sum == 0) && ($maj_sum == 0) && ($mb_sum == 0)){
                print $OTHER_XYL_ONLY "$gene_name\n";
                $in_other_xyl_only++;
            }
        }

        if (($my_species =~ /nivale/) || ($my_species =~ /majus/)){

            if (($xyl_sum == 0) && ($mb_sum == 0)) {

                if (($intra_sum == 0) && ($sister_sum != 0)) {
                    print $SISTER "$gene_name\n";
                    $in_sister_only++;
                }

                if (($intra_sum != 0) && ($sister_sum == 0)) {
                    print $INTRA "$gene_name\n";
                    $in_intra_only++;
                }
                if (($sister_sum != 0) && ($intra_sum != 0)) {
                    print $NIV_MAJ_ONLY "$gene_name\n";
                    $in_niv_and_maj_only++;
                }
            }

            if (($grass_sum != 0) && ($wheat_sum == 0)) {
                print $NIV_GRASS "$gene_name\n";
                $only_niv_grass++;
            }

            if (($grass_sum == 0 ) && ($wheat_sum != 0)) {
                print $NIV_WHEAT "$gene_name\n";
                $only_niv_wheat++;
            }

            if (($xyl_sum == 0) && ($sister_sum == 0) && ($intra_sum == 0) && ($mb_sum != 0)) {
                print $MB "$gene_name\n";
                $in_mb_only++;
            }

            if (($xyl_sum == 0) && ($sister_sum != 0) && ($intra_sum != 0) && ($mb_sum != 0)) {
                print $MIC_ONLY "$gene_name\n";
                $in_all_mic_only++;
            }

            if (($xyl_sum != 0) && ($sister_sum != 0) && ($intra_sum != 0) && ($mb_sum != 0)) {
                print $ALL_XYL "$gene_name\n";
                $in_all_xyl++;
            }
        }

        ########################################################################################

        elsif ($my_species =~ /bolleyi/){
            if (($xyl_sum == 0) && ($niv_sum != 0) && ($maj_sum != 0)) {
                print $SISTER "$gene_name\n";
                $in_sister_only++;
            }
            if (($xyl_sum == 0) && ($maj_sum != 0 ) && ($niv_sum == 0)) {
                $in_maj_only++;
            }
            if (($xyl_sum == 0) && ($maj_sum == 0 ) && ($niv_sum != 0)) {
                $in_niv_only++;
            }
            if (($xyl_sum != 0) && ($niv_sum != 0) && ($maj_sum != 0)) {
                print $ALL_XYL "$gene_name\n";
                $in_all_xyl++;
            }
        }

        ########################################################################################

    }
    #######################################
    # else, if there ARE hits in the sord:#
    #######################################

    else {
        if (($my_species =~ /nivale/) || ($my_species =~ /majus/)){

            if (($xyl_sum !=0) && ($intra_sum != 0) && ($sister_sum != 0) && ($mb_sum !=0)) {
                print $IN_ALL "$gene_name\n";
                $in_all++;
            }
            if (($xyl_sum != 0) && ($intra_sum != 0) && ($mb_sum != 0) && ($sister_sum == 0 )) {
                print $NOT_IN_SIS "$gene_name\n";
                $not_in_sister++;
            }
            if (($xyl_sum !=0) && ($intra_sum == 0) && ($mb_sum !=0) && ($sister_sum !=0 )) {
                print $NOT_IN_INTRA "$gene_name\n";
                $not_in_intra++;
            }
        }

        elsif ($my_species =~ /bolleyi/) {
            if (($xyl_sum !=0 ) && ($niv_sum != 0) && ($maj_sum != 0)) {
                print $IN_ALL "$gene_name\n";
                $in_all++;
            }
        }

        if (($xyl_sum ==0) && ($all_mic_sum == 0)) {
            print $SORD_ONLY "$gene_name\n";
            $in_sord_only++;
        }
        if (($xyl_sum !=0) && ($all_mic_sum == 0)) {
            print $IN_OTHER_ONLY "$gene_name\n";
            $in_other_only++;
        }
    }
}

####################
sub determine_relationships {

    #The locations of the species of interest were determined by the "identify_columns" subroutine

    $my_species = $ARGV[1];

    if ($my_species =~ /majus/) {
        $sister_name = "nivale";
        @intra_locations = @majus_locations;
        @sister_locations = @nivale_locations;
    }

    elsif ($my_species =~ /nivale/) {
        $sister_name = "majus";
        @intra_locations = @nivale_locations;
        @sister_locations = @majus_locations;
    }
    elsif ($my_species =~ /bolleyi/) {
        $sister_name = "niv_and_maj";
        @intra_locations = ();
        @sister_locations = ();
    }

    else {
        die "ERROR: species name was not found. Please enter the species name (majus, nivale, or bolleyi) in lowercase letters.\n";
    }
}

sub identify_columns {
    &define_groups();

    @majus_locations = ();
    @nivale_locations = ();
    @mb_locations = ();
    @xyl_locations = ();
    @other_locations = ();
    @wheat_locations = ();
    @grass_locations = ();

    #Start at 1 (not 0) because first column is "gene_name"
    for ($a = 1; $a <= $#temp_array; $a++) {

        foreach $sp (@xylariales) {
            if ($temp_array[$a] =~ /$sp/) {
                push (@xyl_locations, $a);
            }
        }
        foreach $sp (@mb) {
            if ($temp_array[$a] =~ /$sp/) {
                push (@mb_locations, $a);
            }
        }
        foreach $sp (@majus) {
            if ($temp_array[$a] =~ /$sp/) {
                push (@majus_locations, $a);
            }
        }
        foreach $sp (@nivale) {
            if ($temp_array[$a] =~ /$sp/) {
                push (@nivale_locations, $a);
            }
        }
        foreach $sp (@wheat) {
            if ($temp_array[$a] =~ /$sp/) {
                push (@wheat_locations, $a);
            }
        }
        foreach $sp (@grass) {
            if ($temp_array[$a] =~ /$sp/) {
                push (@grass_locations, $a);
            }
        }
        foreach $sp (@other) {
            if ($temp_array[$a] =~ /$sp/) {
                push (@other_locations, $a);
            }
        }

        foreach $sp (@mm99049) {
            if ($temp_array[$a] =~ /$sp/) {
                push (@mm99049_locations, $a);
            }
        }
        foreach $sp (@mm10095) {
            if ($temp_array[$a] =~ /$sp/) {
                push (@mm10095_locations, $a);
            }
        }
        foreach $sp (@mn11037) {
            if ($temp_array[$a] =~ /$sp/) {
                push (@mn11037_locations, $a);
            }
        }
        foreach $sp (@mn12262) {
            if ($temp_array[$a] =~ /$sp/) {
                push (@mn12262_locations, $a);
            }
        }
        foreach $sp (@mn10106) {
            if ($temp_array[$a] =~ /$sp/) {
                push (@mn10106_locations, $a);
            }
        }
    }
}

#####
sub print_outfiles {
    #This prints a quick, human-readable summary of the results:
    print $OUT "For $ARGV[0]: \n";
    print $OUT '################';
    print $OUT "\n";
    print $OUT "A total of $gene_counter genes were assessed.\n";
    print $OUT "A total of $never genes had no match in any of the genomes studied.\n";
    print $OUT "There were $in_all genes that were in ALL categories.\n";
    print $OUT "There were $not_in_sord genes NOT in the non-xylariales sordariales.\n";
    print $OUT "There were $in_any_xyl genes NOT in a sordariales that WERE in the xylariales (either unique to xyl, unique to mic, or not shared at all).\n";
    print $OUT '################';
    print $OUT "\n";
    print $OUT "There were $in_other_only genes that were only found in the NON-Microdochium species.\n";
    print $OUT "There were $in_other_xyl_only genes that were ONLY found among the NON-Microdochium Xylariales.\n\n";
    print $OUT "There were $in_all_xyl genes that were in ALL of the xylariales, including the Microdochium species\n";
    print $OUT '################';
    print $OUT "\n";
    print $OUT "There were $in_niv_and_maj_only genes that were only found in BOTH M. nivale and M. majus but no other species.\n";
    print $OUT "There were $in_mb_only genes that were only found in M. bolleyi.\n";
    print $OUT '################';
    print $OUT "\n";
    print $OUT "There were $in_sister_only genes in the sister species ($sister_name) ONLY.\n";
    print $OUT "There were $in_intra_only genes that were found only in intraspecific genomes ($my_species) ONLY.\n";
    print $OUT '################';
    print $OUT "\n";
    print $OUT "Among ALL of the hits shared with M.nivale:\n";
    print $OUT "\tthere were $only_niv_wheat genes that were shared ONLY with the wheat isolates.\n";
    print $OUT "\tthere were $only_niv_grass genes that were shared ONLY with the grass isolates.\n";
    print $OUT '################';
    print $OUT "\n";
    print $OUT "There were $not_in_intra genes that were found in every species EXCEPT in the intra-specific genomes ($my_species).\n";
    print $OUT "There were $not_in_sister genes that were found in every species EXCEPT the sister species genomes ($sister_name).\n";
    if ($my_species =~ /bolleyi/) {
        print $OUT '################';
        print $OUT "\n";
        print $OUT "There were $in_niv_only genes found ONLY among the M. nivale isolates. \n";
        print $OUT "There were $in_maj_only genes found ONLY among the M. majus isolates. \n";
        print $OUT "There were $total_in_niv TOTAL genes found among the M. nivale isolates. \n";
        print $OUT "There were $total_in_maj TOTAL genes found among the M. majus isolates. \n";
    }
    if (($my_species =~ /majus/) ||($my_species =~ /nivale/)) {
        print $OUT '################';
        print $OUT "\n";
        print $OUT "There were $total_in_sister TOTAL genes found among the $sister_name isolates. \n";
        print $OUT "There were $total_in_intra TOTAL genes found among the $my_species isolates. \n";
        print $OUT "There were $total_in_mb TOTAL genes found in M. bolleyi.\n";
    }
    print $OUT '################';
    print $OUT "\n";
    print $OUT "There were $total_in_10095 TOTAL genes found in M. majus 10095.\n";
    print $OUT "There were $total_in_11037 TOTAL genes found in M. nivale 11037.\n";
    print $OUT "There were $total_in_12262 TOTAL genes found in M. nivale 12262.\n";
    print $OUT "There were $total_in_10106 TOTAL genes found in M. nivale 10106.\n";
    print $OUT "There were $total_in_99049 TOTAL genes found in M. majus 99049.\n";

    print $OUT "There were $total_in_sord TOTAL genes found among the non-Xylariales and non-Microdochium Sordariomycetes.\n";
    print $OUT "There were $total_in_xyl TOTAL genes found among the non-Microdochium Xylariales.\n";
}

#############################
## Basic file handle stuff ##
#############################
sub open_files {
    open ($IN, "$ARGV[0]");
    open ($OUT, "+>$ARGV[0].summary");

    open ($MIC_ONLY, "+>only_in_microdochium.txt");
    open ($INTRA, "+>only_in_$my_species.txt");
    open ($SISTER, "+>only_in_$sister_name.txt");
    open ($NIV_MAJ_ONLY, "+>only_in_niv_and_maj.txt");
    open ($MB, "+>only_in_mb.txt");
    open ($NIV_WHEAT, "+>niv_wheat_only.txt");
    open ($NIV_GRASS, "+>niv_grass_only.txt");

    open ($NEVER, "+>never_found.txt");
    open ($IN_ALL, "+>in_all_groups.txt");
    open ($SORD_ONLY, "+>not_in_any_xyl.txt");
    open ($OTHER_XYL_ONLY, "+>only_in_other_xyl.txt");
    open ($ALL_XYL, "+>in_all_xyl.txt");
    open ($IN_OTHER_ONLY, "+>in_non_mic_only.txt");

    open ($NOT_IN_SIS, "+>not_in_$sister_name.txt");
    open ($NOT_IN_INTRA, "+>not_in_$my_species.txt");
}
sub close_files {
    close $IN;
    close $OUT;

    close $MIC_ONLY;
    close $INTRA;
    close $SISTER;
    close $NIV_MAJ_ONLY;
    close $MB;
    close $NIV_WHEAT;
    close $NIV_GRASS;

    close $NEVER;
    close $IN_ALL;
    close $SORD_ONLY;
    close $OTHER_XYL_ONLY;
    close $ALL_XYL;
    close $IN_OTHER_ONLY;

    close $NOT_IN_SIS;
    close $NOT_IN_INTRA;
}

###################
## Define groups ##
###################

sub define_groups {
    @xylariales = (
        "astygium.fa",
        "deschscholzii.fa",
        "hypoxylon_co.fa",
        "ptheae.nt",
    );

    @nivale = (
        "mn10106_s63.80",
        "mn12262_abyss53",
        "mn11037_abyss59",
    );

    @wheat = (
        "mn10106_s63.80",
        "mn12262_abyss53",
    );

    @grass = (
        "mn11037_abyss59",
    );

    @majus = (
        "mm10095_a63.nt",
        "mm99049_abyss45",
    );

    @mb = (
        "mb07020_s61.80.nt",
    );
    @other = (
        "aalcalophilum.nt",
        "cgram.nt",
        "chaetom.nt",
        "fusoxy.nt",
        "fusvert.nt",
        "gclavigera.nt",
        "moryzae.nt",
        "mpoae.nt",
        "ncrassa.nt",
        "sthermophile.nt",
        "tterrestris.nt",
        "valboatrum.nt",
        "vdahliae.nt",
        "vlongispor.nt",
    );

    @mn12262 = (
        "mn12262_abyss53",
    );
    @mn10106 = (
        "mn10106_s63.80",
    );
    @mn11037 = (
        "mn11037_abyss59",
    );

    @mm10095 = (
        "mm10095_a63.nt",
    );
    @mm99049 = (
        "mm99049_abyss45",
    );
}

#############
sub define_variables {
    $num_genes = 0;
    $in_sister_only = 0;
    $in_all = 0;
    $in_other_only = 0;
    $never = 0;
    $in_mb_only = 0;
    $in_niv_and_maj_only = 0;
    $in_intra_only = 0;
    $not_in_sister = 0;
    $not_in_intra = 0;
    $not_in_sord = 0;
    $in_other_xyl_only = 0;
    $in_any_xyl = 0;
    $in_all_xyl = 0;
    $in_all_mic_only = 0;
    $in_sord_only = 0;
    $in_other_only = 0;
    $only_niv_wheat = 0;
    $only_niv_grass = 0;
    $in_niv_only = 0;
    $in_maj_only = 0;
    $total_in_mb = 0;
    $total_in_sister = 0;
    $total_in_intra = 0;
    $total_in_niv = 0;
    $total_in_maj = 0;
    $total_in_xyl = 0;
    $total_in_sord = 0;
    $total_in_99049 = 0;
    $total_in_10095 = 0;
    $total_in_11037 = 0;
    $total_in_12262 = 0;
    $total_in_10106 = 0;
}

Appendix 3.10 find_genes_of_diff_length.pl

#!/usr/bin/perl

#Script to identify BLAST hits that are both a) highly conserved, and b) differ in length

#Script use:
# ./find_genes_of_diff_length.pl BLAST_output.parsed
#NOTE: BLAST should be run with the -m 9 flag and results "pre-parsed" with the parse_m9 script
open $IN, "$ARGV[0]";
open $OUT, "+>$ARGV[0].diff_length";

#Print a human-friendly header:
print $OUT "Queryname\tHitname\tDiff\tQlength\tHlength\tMatch_length\n";

#Query id, Subject id, % identity, alignment length, mismatches, gap openings, q. start, q. end, s. start, s. end, e-value, bit score
while (<$IN>) {
    #line by line, look through the BLAST output and examine first the evalue:
    chomp $_;
    @line = split(/\t/, $_);
    #Evalue is entry 10; if the evalue is 0.0, do the length calculation:
    if ($line[10] =~ /0.0/) {
        &calculate();
    }
    #Alternatively, if the evalue is <= e-50 (i.e. if the exponent is >= 50), do the calculation:
    elsif ($line[10] =~ /e-\d+/) {
        $line[10] =~ /e-(\d+)/;
        if ($1 >= 50) {
            &calculate();
        }
    }
}
close $IN;
close $OUT;

#To determine the length difference, calculate the length of the query and the length of the hit:
sub calculate {
    $qlength = abs($line[7] - $line[6]);
    $hlength = abs($line[9] - $line[8]);
    #If the absolute value of the difference between these lengths is greater than 50, grab this data for further analysis:
    $difference = abs($qlength - $hlength);
    if ($difference > 50) {
        print $OUT "$line[0]\t$line[1]\t$difference\t$qlength\t$hlength\t$line[3]\n";
    }
}
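
The length comparison performed by the calculate subroutine can be illustrated with a single worked row. The sketch below is in Python rather than Perl, and the function name and the BLAST row are hypothetical; the column positions follow the tabular field order listed in the script.

```python
def length_difference(fields):
    """Return (difference, query_length, hit_length) for one tabular BLAST row."""
    q_len = abs(int(fields[7]) - int(fields[6]))  # q. end - q. start
    h_len = abs(int(fields[9]) - int(fields[8]))  # s. end - s. start
    return abs(q_len - h_len), q_len, h_len

# Hypothetical row: query positions 1-900 aligned to subject positions 2100-2900
row = "geneA\tscaf1\t98.5\t905\t10\t4\t1\t900\t2100\t2900\t0.0\t1500".split("\t")
diff, q_len, h_len = length_difference(row)
if diff > 50:
    print(row[0], row[1], diff, q_len, h_len, row[3], sep="\t")
```

In this example the query spans 899 positions and the hit 800, so the difference of 99 exceeds the 50-position cut-off and the row would be reported.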

Appendix 3.11 compare_phi_results.pl

#!/usr/bin/perl
use warnings;

#Script to compare the results of the phi BLAST jobs

#script use: ./compare_phi_results.pl list phi.fasta
#where list is a list of all the parsed BLAST files and phi.fasta is all of the protein sequences searched
open $LIST, "$ARGV[0]";
@list = <$LIST>;
close $LIST;

&open_infiles();
&open_outfiles();
open $PHI, "$ARGV[1]";
%names = ();
while (<$PHI>) {
    chomp $_;
    if ($_ =~ />/) {
        $_ =~ s/>//g;
        $names{$_} = $_;
    }
}
close $PHI;

#doing a hash first will strip duplicates...
@names = keys %names;

&do_comparison ();
&close_outfiles();

sub do_comparison {

    foreach $name (@names) {

        if ( ($mn12262{$name}) && ($mn10106{$name}) && ($mn11037{$name}) && ($mm99049{$name}) && ($mm10095{$name}) && ($mb07020{$name}) ) {
            print $IN_ALL "$name\n";
        }

        elsif ( ($mn12262{$name}) && ($mn10106{$name}) && ($mn11037{$name}) && ($mm99049{$name}) && ($mm10095{$name}) ) {
            unless ($mb07020{$name}) {
                print $NOT_MB "$name\n";
            }
        }

        elsif ( ($mm99049{$name}) && ($mm10095{$name}) ) {
            unless ( ($mn12262{$name}) || ($mn10106{$name}) || ($mn11037{$name}) || ($mb07020{$name}) ){
                print $MAJ_ONLY "$name\n";
            }
        }

        elsif ( ($mn12262{$name}) && ($mn10106{$name}) && ($mn11037{$name}) ) {
            unless ( ($mm99049{$name}) || ($mm10095{$name}) || ($mb07020{$name}) ){
                print $NIV_ONLY "$name\n";
            }
        }

        elsif ( ($mn12262{$name}) && ($mn10106{$name}) ) {
            unless ( ($mn11037{$name}) || ($mm99049{$name}) || ($mm10095{$name}) || ($mb07020{$name}) ){
                print $NIV_W_ONLY "$name\n";
            }
        }

        elsif ( ($mn12262{$name}) ) {
            unless ( ($mn10106{$name}) || ($mn11037{$name}) || ($mm99049{$name}) || ($mm10095{$name}) || ($mb07020{$name}) ) {
                print $MN12262_ONLY "$name\n";
            }
        }

        elsif ( ($mn10106{$name}) ) {
            unless ( ($mn12262{$name}) || ($mn11037{$name}) || ($mm99049{$name}) || ($mm10095{$name}) || ($mb07020{$name}) ) {
                print $MN10106_ONLY "$name\n";
            }
        }

        elsif ( ($mn11037{$name}) ) {
            unless ( ($mn10106{$name}) || ($mn12262{$name}) || ($mm99049{$name}) || ($mm10095{$name}) || ($mb07020{$name}) ) {
                print $MN11037_ONLY "$name\n";
            }
        }

        elsif ( ($mm99049{$name}) ) {
            unless ( ($mn10106{$name}) || ($mn11037{$name}) || ($mn12262{$name}) || ($mm10095{$name}) || ($mb07020{$name}) ) {
                print $MM99049_ONLY "$name\n";
            }
        }

        elsif ( ($mm10095{$name}) ) {
            unless ( ($mn10106{$name}) || ($mn11037{$name}) || ($mm99049{$name}) || ($mn12262{$name}) || ($mb07020{$name}) ) {
                print $MM10095_ONLY "$name\n";
            }
        }

        elsif ( ($mb07020{$name}) ) {
            unless ( ($mn10106{$name}) || ($mn11037{$name}) || ($mm99049{$name}) || ($mm10095{$name}) || ($mn12262{$name}) ) {
                print $MB07020_ONLY "$name\n";
            }
        }
        else {
            unless ( ($mn12262{$name}) || ($mn10106{$name}) || ($mn11037{$name}) || ($mm99049{$name}) || ($mm10095{$name}) || ($mb07020{$name}) ) {
                print $NEVER "$name\n";
            }
        }
    }

    $tot_11037 = keys %mn11037;
    $tot_07020 = keys %mb07020;
    $tot_12262 = keys %mn12262;
    $tot_10095 = keys %mm10095;
    $tot_10106 = keys %mn10106;
    $tot_99049 = keys %mm99049;

    print $TOTALS "There were $tot_11037 in 11037, $tot_07020 in 07020, $tot_12262 in 12262, $tot_10106 in 10106, $tot_10095 in 10095, and $tot_99049 in 99049 \n";

}

sub close_outfiles {
    close $MN11037_ONLY;
    close $MN12262_ONLY;
    close $MN10106_ONLY;
    close $MM99049_ONLY;
    close $MM10095_ONLY;
    close $MB07020_ONLY;
    close $NEVER;
    close $NOT_MB;
    close $IN_ALL;

    close $MAJ_ONLY;
    close $NIV_W_ONLY;
    close $NIV_ONLY;
    close $TOTALS;
}
sub open_outfiles {
    open $MN11037_ONLY, "+>in_mn11037_only";
    open $MN12262_ONLY, "+>in_mn12262_only";
    open $MN10106_ONLY, "+>in_mn10106_only";
    open $MM99049_ONLY, "+>in_mm99049_only";
    open $MM10095_ONLY, "+>in_mm10095_only";
    open $MB07020_ONLY, "+>in_mb07020_only";
    open $NEVER, "+>never_found";
    open $NOT_MB, "+>missing_from_mb_only";
    open $IN_ALL, "+>in_all";

    open $MAJ_ONLY, "+>in_maj_only";
    open $NIV_W_ONLY, "+>in_niv_wheat_only";
    open $NIV_ONLY, "+>in_all_niv_only";
    open $TOTALS, "+>total_found_summary";
}
sub open_infiles {
    foreach $file (@list) {
        open $IN, "$file";
        if ($file =~ /12262/) {
            %mn12262 = ();
            while (<$IN>) {
                @in = split(/\t/, $_);
                $mn12262{ $in[0] } = $in[0];
            }
        }
        elsif ($file =~ /10106/) {
            %mn10106 = ();
            while (<$IN>) {
                @in = split(/\t/, $_);
                $mn10106{ $in[0] } = $in[0];
            }
        }
        elsif ($file =~ /11037/) {
            %mn11037 = ();
            while (<$IN>) {
                @in = split(/\t/, $_);
                $mn11037{ $in[0] } = $in[0];
            }
        }
        elsif ($file =~ /99049/) {
            %mm99049 = ();
            while (<$IN>) {
                @in = split(/\t/, $_);
                $mm99049{ $in[0] } = $in[0];
            }
        }
        elsif ($file =~ /10095/) {
            %mm10095 = ();
            while (<$IN>) {
                @in = split(/\t/, $_);
                $mm10095{ $in[0] } = $in[0];
            }
        }
        elsif ($file =~ /07020/) {
            %mb07020 = ();
            while (<$IN>) {
                @in = split(/\t/, $_);
                $mb07020{ $in[0] } = $in[0];
            }
        }
        close $IN;
    }
}

Appendix 3.12 check_proximity.pl

#!/usr/bin/perl
#script use: ./check_proximity.pl phi_output te_output 5000
#Where the final argument is the desired maximum distance between the hits to be counted as a "positive" result
#note that blast jobs should be performed using the -m9 flag to produce a tabular output
open $PHI, "$ARGV[0]";
open $TE, "$ARGV[1]";
$distance = $ARGV[2];
$ARGV[0] =~ s/_vs_phi_m9//g;
open $OUT, "+>$ARGV[0].compared.$distance";
#Print a human-friendly header:
print $OUT "scaff_name\tphi_hit\tphi_start\tphi_end\tte_name\tte_start\tte_end\n";

#Create a hash that will store the information about the locations of the TEs:
%te = ();

#Read through the output file for the TE blast results.
#Put the results in a hash, where the keys are the scaffolds with matches, and the values are arrays
#Each element of the array consists of the TE name and its start and end positions in the scaffold.
while ($line = <$TE>) {
    chomp $line;
    #For each non-comment line, split the line at the tabs to create an array:
    unless ($line =~ /\#/) {
        @line = split(/\t/, $line);

        #Ensure that the start and end positions are in the desired order (may be inverted $
        if ($line[8] > $line[9]) {
            $location = "$line[0]\t$line[9]\t$line[8]\t";
        }
        else {
            $location = "$line[0]\t$line[8]\t$line[9]";
        }

        #If at least one match has already been recorded on this scaffold, add this to the$
        if ( exists($te{$line[1]} )) {
            push(@{$te{$line[1]}}, $location);
        }
        else {
            $te{$line[1]} = [$location,];
        }
    }
}

#For the pathogen-host interaction gene hits (PHI hits), read through each non-comment line:
while (<$PHI>) {
    chomp $_;
    unless ($_ =~ /\#/) {
        @entry = split(/\t/, $_);
        $hit_name = $entry[1];
        if ($entry[8] > $entry[9]) {
            $start = $entry[9];
            $end = $entry[8];
        }
        else {
            $start = $entry[8];
            $end = $entry[9];
        }

        #If there is at least one TE match on the same scaffold as the current PHI scaff, then:

        if (exists($te{$hit_name})) {
            foreach (@{$te{$hit_name}}) {
                @temp = split(/\t/, $_);
                #If the start of one match is within the specified distance of the end of the other, print:
                if ( (abs($start - $temp[2]) <= $distance ) || (abs($end - $temp[1]) <= $distance ) ) {
                    print $OUT "$hit_name\t$entry[0]\t$start\t$end\t$temp[0]\t$temp[1]\t$temp[2]\n";
                }
            }
        }
    }
}
close $PHI;
close $TE;
close $OUT;
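
The proximity test at the centre of check_proximity.pl can be written as a small function. The sketch below is in Python rather than Perl; the function names and the coordinates are hypothetical, and both hits are assumed to lie on the same scaffold.

```python
def ordered(a, b):
    """Return the two coordinates as (start, end) with start <= end."""
    return (a, b) if a <= b else (b, a)

def is_near(phi_start, phi_end, te_start, te_end, max_distance):
    """True if the start of the PHI hit lies within max_distance of the TE end,
    or the end of the PHI hit lies within max_distance of the TE start."""
    return (abs(phi_start - te_end) <= max_distance
            or abs(phi_end - te_start) <= max_distance)

# Hypothetical hits on one scaffold; the PHI hit was reported in reverse orientation
phi_start, phi_end = ordered(12000, 10000)
print(is_near(phi_start, phi_end, 15000, 16000, 5000))
print(is_near(phi_start, phi_end, 20000, 21000, 5000))
```

Here the first transposable element begins 3000 positions after the end of the PHI hit and is counted, whereas the second begins 8000 positions away and is not.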

Appendix 3.13 eliminate_duplicates.pl

open $IN, "$ARGV[0]";
open $OUT, "+>$ARGV[0].no_duplicates2";

%hits = ();
while (<$IN>) {
    chomp $_;
    @line = split(/\t/, $_);

    $te = $line[4];
    $te =~ /\w+_(\w+)/;
    $te = $1;

    $base = 10000;
    $rounded = int($line[6]/$base+1) * $base;

    $scaf = $line[0];

    $concat = $scaf . $te . $rounded;
    $hits{$concat} = $_;

}
print $OUT "$hits{$_}\n" for (keys %hits);
close $IN;
close $OUT;
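
The binning used to collapse duplicate records can be illustrated as follows. This is a Python sketch rather than the Perl script; the function names and example rows are hypothetical, the rows follow the column order written by check_proximity.pl, and the header row is assumed to have been removed beforehand. Unlike the Perl hash, the Python dictionary keeps the retained rows in input order.

```python
import re

def bin_key(fields, base=10000):
    """Key made of the scaffold, the TE name after its last underscore, and the
    TE end position rounded up to the next multiple of base."""
    match = re.search(r"\w+_(\w+)", fields[4])
    family = match.group(1) if match else fields[4]
    rounded = (int(fields[6]) // base + 1) * base
    return fields[0] + family + str(rounded)

def eliminate_duplicates(lines, base=10000):
    """Keep one row per key; a later row with the same key replaces an earlier one."""
    kept = {}
    for line in lines:
        kept[bin_key(line.split("\t"), base)] = line
    return list(kept.values())

# Hypothetical rows: scaff_name, phi_hit, phi_start, phi_end, te_name, te_start, te_end
lines = ["scaf1\tphiA\t100\t900\tmn_Gypsy\t12000\t12500",
         "scaf1\tphiB\t5000\t5900\tmn_Gypsy\t13000\t13800",
         "scaf1\tphiA\t100\t900\tmn_Copia\t12000\t12500"]
for line in eliminate_duplicates(lines):
    print(line)
```

The first two rows fall into the same 10 kb bin for the same element name on the same scaffold, so only the second is retained, together with the row for the other element.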

Appendix 3.14 Alignment of putative M. bolleyi 07020 EF-1α sequence with the Microdochium EF-1α primers EFNivF and EFMajF and the reverse complement of primer EFMicR

g6339.t1 ATGGGTAAATCCGACAAGGCTCACATCAACGTCGTCGTTATCGGCCACGTCGATTCCGGC
g6339.t1 AAGTCCACCACCACCGGTCACTTGATCTACAAGTGCGGTGGTATCGACAAGCGTACCATC
g6339.t1 GAGAAGTTCGAGAAGGAAGCTGCCGAGCTCGGCAAGGGTTCCTTCAAGTATGCGTGGGTT
g6339.t1 CTTGACAAGCTCAAGGCCGAGCGTGAGCGTGGTATCACCATCGACATTGCCCTCTGGAAG
g6339.t1 TTCGAGACTCCCAAGTACTATGTCACCGTCATTGACGCCCCCGGTCACCGTGATTTCATC
g6339.t1 AAGAACATGATCACTGGTACTTCCCAGGCCGATTGCGCCATTCTCATCATTGCCGCTGGT
g6339.t1 ACTGGTGAGTTCGAGGCTGGTATCTCCAAGGATGGCCAGACTCGTGAGCACGCCCTGCTC
g6339.t1 GCCTACACCCTCGGTGTCAAGCAGCTCATCGTCGCCATCAACAAGATGGACACCACCAAG
g6339.t1 TGGTCCGAGTCTCGTTTCCAGGAGATCATCAAGGAGACCTCCTCCTTCATCAAGAAGGTC
g6339.t1 GGCTACAACCCCAAGCAGGTCGCTTTCGTCCCCATTTCCGGCTTCAACGGCGACAACATG
g6339.t1 CTTGAGCCCTCCCCCAACTGCCCCTGGTACAAGGGCTGGGAGAAGGAGATCGGCGGCACC
g6339.t1 AAGTCCTCCGGCAAGACCCTTCTTGAGGCCATCGACTCCATCGAGACCCCCAAGCGTCCC
EFMicR_revc CCATCGACTCCATCGA
g6339.t1 TCCGACAAGCCCCTCCGCCTTCCCCTCCAGGATGTCTACAAGATCGGTGGTATTGGCACG
g6339.t1 GTGCCCGTCGGCCGTATCGAGACCGGTACCATCAAGCCCGGCATGGTCGTCACCTTCGCC
g6339.t1 CCCGCTGGTGTCACCACTGAAGTCAAGTCCGTCGAGATGCACCACGAGTCTCTCCCCGAG
g6339.t1 GCTTTCCCCGGTGACAACGTCGGCTTCAACGTCAAGAACGTGTCCGTCAAGGACATTCGT
g6339.t1 CGTGGCAACGTTGCCGGTGACACCAAGAACGACCCCCCGTTGGGCGCCAACACCTTCACC
g6339.t1 GCCCAGGTCATCGTCCTGAACCACCCTGGCCAGGTCGGTGCCGGTTACGCCCCTGTTCTC
EFMajF CCCCTTCTCCC
EFNivF GTTCCCCTGTCT
g6339.t1 GACTGCCACACTGCCCACATTGCTTGCAAGTTCACCGAGCTCCTCGAGAAGATCGACCGC
EFMajF TATCGC
EFNivF GACTGTTGT
g6339.t1 CGTACCGGTAAGGCTACCGAGACCAGCCCCAAGTTCATCAAGTCTGGTGATGCCGCCATC
g6339.t1 GTCAAGATGACTCCCTCCAAGCCCATGTGCGTTGAGGCTTTCACCGACTACCCTCCCTTG
g6339.t1 GGCCGTTTCGCCGTCCGTGACATGAGACAGACCGTCGCTGTCGGTGTCATCAAGGCCGTC
g6339.t1 GACAAGTCCCAGGACTCTGGCAAGAAGACCAAGTCTGCTGAGAAGAAGCTTGGCAAGAAG
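Because EFMicR is shown as its reverse complement, it may help to see how the primer orientation can be recovered and how the displayed string can be located in the gene sequence. The sketch below is illustrative only: the helper name revcomp and the variable names are hypothetical, and the two strings are copied from the alignment above (the g6339.t1 row that precedes the EFMicR_revc line, and the EFMicR_revc string itself).

```perl
use strict;
use warnings;

# Reverse-complement a DNA string (A<->T, C<->G); hypothetical helper name.
sub revcomp {
    my ($seq) = @_;
    my $rc = reverse $seq;            # scalar context reverses the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;     # complement each base
    return $rc;
}

# Row of g6339.t1 and the primer string as printed in the alignment.
my $row         = 'AAGTCCTCCGGCAAGACCCTTCTTGAGGCCATCGACTCCATCGAGACCCCCAAGCGTCCC';
my $efmicr_revc = 'CCATCGACTCCATCGA';

# Portion of EFMicR shown in the alignment, in primer orientation.
my $efmicr = revcomp($efmicr_revc);

# 0-based position of the exact match in the row; -1 if there is none.
my $offset = index($row, $efmicr_revc);

print "EFMicR (from its reverse complement): $efmicr\n";
print "Exact match in the g6339.t1 row at 0-based offset $offset\n";
```

With the strings as printed above, the reverse complement of EFMicR matches this row exactly, starting 28 bases into the row. EFMajF and EFNivF do not match the M. bolleyi sequence exactly, so the same exact-match search would not locate them.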


Appendix 3.15 Alignment of predicted gene sequences that are putatively unique to Microdochium spp.

Mn12262 ATGTTCTCC------ACCTCCGCCCTCGCGGCTGTCTTCGCCGCGGCCTCGCTGTTC Mm10095 ATGTTCTCCCCCTCTGTCACCTCCGCCCTTGCGGCAGTCCTCGCAGCGACCTCGCTGTTT Mm99049 ATGTTCTCCCCCTCTGTCACCTCCGCCCTTGCGGCAGTCCTCGCAGCGACCTCGCTGTTT Mn11037 ATGTTCTCCCCTTCGCTCTCCTCGGCCCTCGCGGCGGTCTTCGCCGCCACCTCACTGCTC Mn10106 ATGTTCTCC------ACCTCCGCCCTCGCGGCTGTCTTCGCCGCGGCCTCGCTGTTC Mb07020 ATGCTCTCCT------TCGCCCACATCGCCACGACGATCCTCGCCGCTGCCTCCATCTTG *** ***** ** * ** * ** **** ** **** * *

Mn12262 AGCACAGCGCAAGCCAACTTCGACGTTTACCGCACCGAGCTCTACTCGGGCAACAGACCC Mm10095 GGCACGGCACAAGCCAACTTTGACGTCTACCGCACCGAGCTCTACTCGGGCAACAGGCCT Mm99049 GGCACGGCACAAGCCAACTTTGACGTCTACCGCACCGAGCTCTACTCGGGCAACAGGCCT Mn11037 GGCACGGCACAAGCCAACTTCGACATCTATCGCACCGAGATATACTCGGGTAACAGACCC Mn10106 AGCACAGCGCAAGCCAACTTCGACGTTTACCGCACCGAGCTCTACTCGGGCAACAGACCC Mb07020 GGCACGGCGCAGGCCAACTTTGACGTCTACCGCGTCGAGCTCTTCTCGGGCCGCCGGCCC **** ** ** ******** *** * ** *** **** * * ****** * * **

Mn12262 GCCATCCTCTGGCAATTCTGGGAGGCCGAAGCCCCCAAGGACTGCGGCGCCATTGTGAAG Mm10095 GCCATCCTGTGGCAATTCTGGGAGGCCGAGGCGCCCAAGGACTGCAACGCCATCATAAAG Mm99049 GCCATCCTGTGGCAATTCTGGGAGGCCGAGGCGCCCAAGGACTGCAACGCCATCATAAAG Mn11037 TCCATCACCTGGCAGTTTTGGGAGGCCGAGGCGCCCAAGGACTGCGGCGCCATCCTGAGG Mn10106 GCCATCCTCTGGCAATTCTGGGAGGCCGAAGCACCCAAGGACTGCGGCGCCATTGTGAAG Mb07020 GCCGTCATGTGGCAGTTCTGGGAGGCCGAGGCGCCCAAGGACTGTGGCGCGATCCTCAAG ** ** ***** ** *********** ** *********** *** ** * * *

Mn12262 AACATGCAGTACGAGGAACTCAACAACAAGGGACC---GTGGGACATTTTCTGGGGTGTC Mm10095 AACATGCAGTACGAGGAACTCAACAACAAGGGCCC---GTGGGACATTTTCTGGGGTGTG Mm99049 AACATGCAGTACGAGGAACTCAACAACAAGGGCCC---GTGGGACATTTTCTGGGGTGTG Mn11037 AACCAGGCGTACGAGGAGCTCAACAACAAGGGGCC---ATGGGACATTTTCTGGGGCGTC Mn10106 AACATGCAGTACGAGGAACTCAACAACAAGGGACC---GTGGGACATTTTCTGGGGTGTC Mb07020 AGGCAGATGTACGAGGAGCTCAACAACAAGGGGCCCTGGTGGGCCATTTTCTGGGGCGTG * * ********* ************** ** **** ************ **

Mn12262 CACTGCACCGGCAGCGGGTGCAGCAG---CCTCGACCCCCCGGGCAACATCGATGTGCTG Mm10095 CACTGTACCGGTAGCGGGTGCGACGA---CCTCAACCCCCCGGGTGACATTGATATCCTG Mm99049 CACTGTACCGGTAGCGGGTGCGACGA---CCTCAACCCCCCGGGTGACATTGATATCCTG Mn11037 CACTGCACCGGCAGCGGGTGCGACGA---CCTCAACCCCCCGGGCAACATCGATGTGCTC Mn10106 CACTGCACCGGCAGCGGGTGCAGCAG---CCTCGACCCCCCGGGCAACATCGATGTGCTG Mb07020 CACTGCACGGGCAGCGGCTGCGACAAGAGCCTGAACCCCCCCGGCGACATCGACCAGCTG ***** ** ** ***** *** * *** ******* ** **** ** **

Mn12262 AGGCTCAAGCTCCGCGCCAATCCCCTGCTAGACTGGACCTTGAGAAAGGACCAGGGCTGG Mm10095 AGGCTCAAGCTCCGCGCGAACCCCCTCCTCGACTGGACCTTGAGAAAGGACCAGGGCTGG Mm99049 AGGCTCAAGCTCCGCGCGAACCCCCTCCTCGACTGGACCTTGAGAAAGGACCAGGGCTGG Mn11037 AGGCTCAAGCTTCGCCCTCAGCCCCTGCTCGACTGGACCTTGAGAAAAGACCAGGGCTGG Mn10106 AGGCTCAAGCTCCGCGCCAATCCCCTGCTAGACTGGACCTTGAGAAAGGACCAGGGCTGG Mb07020 AGGCTCAAGATCAAGGCCGACCCGCTGCTCGACTGGACCCTGAGGAAGGAGAACGGGTGG ********* * * * ** ** ** ********* **** ** ** * ** ***

Mn12262 TCGATGATTGGCCGCGACGGCAACAACTACGGCAACTGCATCGTGTTCCCCCAAGGCGAC Mm10095 TCGATGATTGGCCGCGACGGCAACAACTACGGCAACTGCATCGTCTTCCCCCAGGGCGAC Mm99049 TCGATGATTGGCCGCGACGGCAACAACTACGGCAACTGCATCGTCTTCCCCCAGGGCGAC Mn11037 TCGATGATTGGCCGGGACGGCAACAACTACGGCAACTGCATCGTGTTCCCCCAAGGCGAC Mn10106 TCGATGATTGGCCGCGACGGCAACAACTACGGCAACTGCATCGTGTTCCCCCAGGGCGAC Mb07020 TCGATGATGGGCAACGACGGCAACAAGTACGGCGACTGCATGCCCTTCCCCAACGGCGAC ******** *** *********** ****** ******* ****** * ******

Mn12262 CACGAATGCAACATAGCTCTAGGCTGGAACCGCCTCGTCACGCGCAGGAAGTTCCGCTGC Mm10095 CACAGCTGCAACATCGCTCTGGGCTGGAACCGCCTCGTCACGCGCAGGAAGTTCCGCTGC Mm99049 CACAGCTGCAACATCGCTCTGGGCTGGAACCGCCTCGTCACGCGCAGGAAGTTCCGCTGC Mn11037 CACAGCTGTAACATCTCC---GGCTGGAACCACCTCGTCACGCGCAGAAAGTTCCGCTGC Mn10106 TACGAATGCAACATAGCTCTGGGCTGGAACCGCCTCGTCACGCGCCGGAAGTTCCGCTGC Mb07020 GTCGAGTGCCCGCAGCCGCTCGGCTGGAACCACCTCGTCCTGCGCAGGAAGTTCCGCTGC * ** * ********** ******* **** * ************

Mn12262 CTGACGAGATTCACGGCCAACGACCTCAAC---GGCGTCAGCCGCA--AGCGCGACGACG Mm10095 CTGACGAGGTTCACGGCCGACAACCTGAAC---GGCGTCAATCGCA--AGCGCGCGGACG Mm99049 CTGACGAGGTTCACGGCCGACAACCTGAAC---GGCGTCAATCGCA--AGCGCGCGGACG Mn11037 CTGACGAGGTTCACGGCCGACGAGGTCGCC---GGCAGAACCCGCA--AGCGCGCAGACG Mn10106 CTGACGAGATTCACGGCCAACGACCTCAAC---GGCGTCAGCCGCA--AGCGCGCCGACG Mb07020 CTGACCAAGTTCACCGCGGACGACCTGAGCATGGTCGCGAGCCGCTCTATCTCAATCCTG ***** * ***** ** ** * * * * * * *** * * * *

Mn12262 G------GGACGACGAGGCCCCGCCGATGACTCTAGAGCAGTACATCG Mm10095 G------CGACGACGAGGCCCCGCCGATGACTCTGGAGCAGTACATCG Mm99049 G------CGACGACGAGGCCCCGCCGATGACTCTGGAGCAGTACATCG Mn11037 GCGCGGACGCAGAGGAAGGGGCTGGCGAGGCACCGCTCTTGACTCTGGAGCAGTACATTG Mn10106 G------GGACGACGAGGCCCCGCCGATGACTCTAGAGCAGTACATCG Mb07020 ATCCGCATGCCCAGAGACCGGCCAG-GATGCACTATCATGGAAGACACCACCATGAGTC- * ** ** * ** * * *

Mn12262 AGAAGGAGCACGGCACGTCTCC--GGCGATCAACTTGGT----CGGCAAAGCGTTTAACG Mm10095 AGAAGAACCACGGAACGTCTCC--GGCGATCAACTTGAT----TCCCAGGACTTTCAACG Mm99049 AGAAGAACCACGGAACGTCTCC--GGCGATCAACTTGGT----TCCCAGGACTTTCAACG Mn11037 AGAAGGAGTTCGGCACGACTCC--GGCGGTGAACTTGGT----CGCCAGAGAGTTCAAAG Mn10106 AGAAGGAGCACGGCACGTCTCC--GGCGATCAACTTGGT----CGGCAAATCGTTCAACG Mb07020 AGCAGGACCACGGTGCTGTGCCAAGACATGAAGCACCGCACAGCACCACGGGGCCGGGAA ** ** * *** * ** * * * * **

Mn12262 CTATGC-CAGACTCCACGACG--GAGCAAGTCTTGCTGAGCCTGAGCGG----GATGCTG Mm10095 CGATGC-CAGACTCCACGACG--GAGCAAGTCCTGCTGAGCTTGAGTGG----GATGCTA Mm99049 CGATGC-CAGACTCCACGACG--GAGCAAGTCCTGCTGAGCTTGAGTGG----GATGCTA Mn11037 CAATGC-CAGACTCCACGACT--GGCCAGGTCTTGCTGAGCCTGGGCGG----GATGCTG Mn10106 CTATGC-CAGACTCCACGACG--GAGCAAGTCTTGCTGAGCCTGAGCGG----GATGCTG Mb07020 TAGCACTCGGACACTGTGATGTTGCTCAAGCTTTACCTACCCTGCACAACAGCAACCCCG * * *** * ** * ** * * * * * ** * *

Mn12262 AGGCCAGCGCCAGCACCAGTTCCGGAAGCGAACACCACCATCGCTGCAACGGACAACGCG Mm10095 ACACCAGCGCCAGCGCCAG------AAGCGAGTCCCACCAGCGCTACAATAGACAACGCG Mm99049 ACACCAGCGCCAGCGCCAG------AAGCGAGTCCCACCAGCGCTACAATAGACAACGCG Mn11037 AGGCCGGCGCCGGCGCCAG------CGGTGAGCAGCACCACCGGCTCTGCTCCTGACACG Mn10106 AGGCCAGCGCCAGCACCAGTTCCGGAAGCGAACACCACCATCGCTGCAACGGACAACGCG Mb07020 AAACCCGAACGAACGGCTGTCGGATCGTGACAAATAGCCACCACTACTACGGCAGCGGAG * ** * * * * * *** * * *

Mn12262 CCAGCGACG------ Mm10095 CCAGCGACACCTAATAGAGGACGCCACGGGTTACCACCAGCAAGTGAGCTGCCAGAAACT Mm99049 CCAGCGACACCTAATAGAGGACGCCACGGGTTACCACCAGCAAGTGAGCTGCCAGAAACT Mn11037 CCCGCCACA------GCAACCAGCAATAAGCTAGAGGT------ Mn10106 CCAGCGACG------ Mb07020 CCGTCGCCA------TCACCGGCGCC------ ** * *

Mn12262 ------ACCGACTTCAACGACAAG--CTCGTGCCAGGCCGGCCGGTGCACATGATCAGAA Mm10095 ACGTGTACCGACCGCAACGACAAG--CTCGTGCCAGGCCAGCCTGTGCATATGATCAGAA Mm99049 ACGTGTACCGACCGCAACGACAAG--CTCGTGCCAGGCCAGCCTGTGCATATGATCAGAA Mn11037 -----TACTGAGCACAACGACAAG--CTCATCCCGGGCCAGCCTCTGCACACGGTCAGAA Mn10106 ------ACCGACTTCAACGACAAG--CTCGTGCCAGGCCGGCCGGTGCACATGATCAGAA Mb07020 ------ATCTACGACCACGCCAAGAACGCCTGCCAGGGCCGGAGCGGCAACCG--CGGTG * * * *** **** * * * ** ** * * *** * * *

Mn12262 CCTTGACATTGTCAGATGC-CGTCGATGAAGAAGAGACCCTATCATTACTGCG-CGACAC Mm10095 CCCTGACGTTGTCAGACGC-CGTCGATGGA---GAGACCCTATCATTACTGCG-TGACCT Mm99049 CCCTGACGTTGTCAGACGC-CGTCGATGGA---GAGACCCTATCATTACTGCG-TGACCT Mn11037 CCCTGACGTTGCCAGATGC-TGCCAGAGAA---GAGAGCCT------ACTGCG-CGAGCC Mn10106 CCTTGACATTGTCAGATGC-CGTCGATGAAGAAGAGACCCTATCATTACTGCG-CGACAC Mb07020 CTCTGCAAGGGTTGGTTCCGCGCCGCCCGGC--GACGTCCCGCCGGTTCTGCGTCAACAA * ** * * * * * ** ** ***** *

Mn12262 CCCGTCCACGACTGGCCAGGATGATGGTGATAACCCAGGGGCTCGCCGCGTGCTCGCCCG Mm10095 CCCGTCCGCGACTGGCCAGGATGATGACAACGACCCAGGACCTCGCCGCGTGCTCGCCCG Mm99049 CCCGTCCGCGACTGGCCAGGATGATGACAACGACCCAGGACCTCGCCGCGTGCTCGCCCG Mn11037 GCCGACCGCAAGTAACCG---CCATGGAGACAGCCAAGAGGCTCGCGACGTGCTGGCTCG Mn10106 CCCGTCCACGACTGGCCAGGATGATGGTGATAACCCAGGGGCTCGCCGCGTGCTCGCCCG Mb07020 CCTGGCCGAAGCGCATCGAG-TTCTTTCGTCAACCAACGAGAACACCCAGCAGGGGTTCG * * ** * * ** * * * * * **

Mn12262 CGTCCATGTTCTGCAGCTCCAGCCTACGGACCGGCCGATCGAGGACCGCTTCTCGATACG Mm10095 CGTCCATGTCCTGCAGCTCCAGCCTACGGACCGGCCGATCGAGGACCGCTTCTCGATCCG Mm99049 CGTCCATGTCCTGCAGCTCCAGCCTACGGACCGGCCGATCGAGGACCGCTTCTCGATCCG Mn11037 CGTCCATGTCCTGCAGCTCCAGCCCACGGACCGGCCGATCGAGGACCGGTTCTCGATCCG Mn10106 CGTCCATGTTCTGCAGCTCCAGCCTACGGACCGGCCGATCGAGGACCGCTTCTCGATACG Mb07020 A---CCTCTACGACGGTGACTGCCTGCGGAAC-----TTTGAGAG--GCTCAACGACCTG * * * * * * * *** **** * * *** * * *** *

Mn12262 CCTGACGACGCGCTCGGGCGCCAGCATCG------Mm10095 CCTGACG---CGCTCGGGCACCACCGCCGCCTCCGCCGCTGCTGGTACCACTACTACCGG Mm99049 CCTGACG---CGCTCGGGCACCACCGCCGCCTCCGCCGCTGCTGGTACCACTACTACCGG Mn11037 CCTGACG---CGGTCGAGTACTGCTGCTGCCGCTGCCGCCGTCACCAACACCACCGGTGA Mn10106 CCTGACGACGCGCTCGGGCGCCAGCATCG------Mb07020 TGCGATTCCACGATCCTCGGCCAGCCGGGGTACACCAGAGGCGACCACCTCTGGAACTCG ** ** ** * *

Mn12262 -----ATCCT----GGGAC-GCA----TCTCGCAA--GGC----CGGCGACGGGCACAGG Mm10095 TCATGATCCT----GCAAC-GCA----TCTCGCAG--GGC----CAGCAGCGGCGGCCGG Mm99049 TCATGATCCT----GCAAC-GCA----TCTCGCAG--GGC----CAGCAGCGGCGGCCGG Mn11037 CGCCAGCAGTAGCAGCAACAGCAGCGGTCCTGCAACAGGTGCCGCGGAGCCGCCTGCCGG Mn10106 -----ATCCT----GGGAC-GCA----TCTCGCAA--GGC----CGGCGACGGGCACAGG Mb07020 GGCTGGTTCT--TCAAGACCGGG---CCATTGCCAGTGGT----AGACGACGCCGGCAAA * ** * ** ** ** *


Mn12262 GA---GGAGCAGCAGGACGGGGGTCCGCCTGCAGCTCGG-CGTGTACGACGGGCACCGCG Mm10095 GACGGGAAGGAGCAGAACGAGCATCCGCCTGCAGCTCGG-CGTGTACGACGGGCACCGCG Mm99049 GACGGGAAGGAGCAGAACGAGCATCCGCCTGCAGCTCGG-CGTGTACGACGGGCACCGCG Mn11037 GACGCGGAGAAGCAGAACGCGGGTCCGCCTGCAGCTCGG-CGTGTATGACGGGCACCGCG Mn10106 GA---GGAGCAGCAGGACGGGGGTCCGCCTGCAGCTCGG-CGTGTACGACGGGCACCGCG Mb07020 GAC--ACACCATCGGAACGAGCAGCAGCAGGCGTCACGGGCAAGCAAGGAACCAACCATG ** * * * * *** * * ** ** * *** * * * * *** *

Mn12262 GGTCGTGGGCCGCGCAGCACGTCGCGGCCACGCTGCCGCAAAGGCTGGATGATGTAGAGC Mm10095 GGCCCTGGGCTGCGCAGCACGTCGCGGCCACGCTGCCCGAGAGGCTGGATGGTATAGACT Mm99049 GGCCCTGGGCTGCGCAGCACGTCGCGGCCACGCTGCCCGAGAGGCTGGATGGTATAGACT Mn11037 GGCCGTGGGCCGCGCAGCACGTCGCGGCCACGCTGCCCGCAAGGCTGGATGGTATAGAGC Mn10106 GGTCGTGGGCCGCGCAGCACGTCGCGGCCACGCTGCCGCAAAGGCTGGATGATGTAGAGC Mb07020 TCCTCAGGAGAATGCGCGCCGTCATCAAGGGAGACCCGCAAGCGGCAGCCGTGATAGTGT ** ** **** ** * * * ***

Mn12262 TGGCTTCCTCTACTGCAGAAGCAGGCCCTGTCTTGTACGAGGGAGGGAAATCGGCGGCGG Mm10095 TGGCCTCCTCGGCGGC------GGCGGCAG Mm99049 TGGCCTCCTCGGCGGC------GGCGGCAG Mn11037 TGACTTCTTCTTCTTCGGCGGCGGGCCTTGATTCGGAACATGG------AGGAGTCG Mn10106 TGGCTTCCTCTACTGCAGAAGCAGGCCCTGTCTTGTACGAGGGAGGGAAATCGGCGGCGG Mb07020 TGTTCATTATATCCCTTATGGACCCCCACGTCCTGAGTGGTGT-----CATAGATGGCTG ** * * *

Mn12262 CAGTGGCGGGGCATCCGGAGGCCGTTGAGGAGCGTGTCATTTCGGCATTTCAGGCACTCG Mm10095 CAGCAGCAGGGGATCCGGAGGCCGTTGGGGAGCGCTTCGTTTCGGCGTTTCGGGCTCTAG Mm99049 CAGCAGCAGGGGATCCGGAGGCCGTTGGGGAGCGCTTCGTTTCGGCGTTTCGGGCTCTAG Mn11037 TGGCAGCAGGGGGGTTCGGGGGTATTGAGGAGCGAGTCGTTTCGGCGTTTCGGGCCCTCG Mn10106 CAGTGGCGGGGCATCCGGAGGCCGTTGAGGAGCGTGTCATTTCGGCATTTCAGGCACTCG Mb07020 CGACCGCGGTGCGAGTGAAGAGTGC-AGCGAGCGTATGACCAAGCCGCCTATG-----CA ** * * * ***** * * * * *

Mn12262 ACGATGAGATGCTGGAGAGTTTTACGACGGCTGACAACCCGGACCTACTTGTTTCGAAAG Mm10095 ACGGCGAGATGCTGGACAGCTTCACGAGGGCTGAAGGCCCGGATCTGCTTGCTTCAAAGG Mm99049 ACGGCGAGATGCTGGACAGCTTCACGAGGGCTGAAGGCCCGGATCTGCTTGCTTCAAAGG Mn11037 ACGCCGAGATGCTGAACAGTTTTACGAGGGCTGAGAGCCCGGACCTGCTTGCTTCGAAAG Mn10106 ACGATGAGATGCTGGAGAGTTTTACGACGGCTGACAACCCGGACCTACTTGTTTCGAAAG Mb07020 ACGTCCGACGGTTGTGGGCTGTTACATCGCCTGTGGAACCCTCGCTAACGAC--CAAGAA *** * ** * ** * *** ** ** * *

Mn12262 GTGGCCAGCGTCGCCGTCGGCTGCTCGG--GCTTGTAG--GCCTTGGTTCGAAGAAAGAT Mm10095 GTGACCAGCGTCGCCGGCGTCTGCTCGG--GCTTGTAG--GCCTTGGTGCGAAGAAAGAT Mm99049 GTGACCAGCGTCGCCGGCGTCTGCTCGG--GCTTGTAG--GCCTTGGTGCGAAGAAAGAT Mn11037 GTGGTCAGCTTCGCCGCCGACTGCTCGG--GCTTGTAG--GCCTGGTTGGGAAGAAGGAT Mn10106 GTGGCCAGCGTCGCCGTCGGCTGCTCGG--GCTTGTAG--GCCTTGGTTCGAAGAAAGAT Mb07020 GCTGCTACCGGCATTTCCGTATGCGTGGTGGTTCGCTGTGAGGTTGACAATATTGATGAT * * * * ** *** ** * * * * * * * * ***

Mn12262 CAGGGAAAGGAAGGGCAGGACAA--ACCGAATGTACTGGATGACG-ACCTGACAGCATTG Mm10095 CAGGGAAAAGAAGGGCGAGACAA--ACCAAGTGTTTTGGATGACG-GCCTGCCAGCATTG Mm99049 CAGGGAAAAGAAGGGCGAGACAA--ACCAAGTGTTTTGGATGACG-GCCTGCCAGCATTG Mn11037 CAGGGGCAGGAACCTCGAAACAA--ATCGGACATGCTAGAGGACA-GTCTACCGGCGTTG Mn10106 CAGGGAAAGGAAGGGCAGGACAA--ACCGAATGTACTGGATGACG-ACCTGACAGCATTG Mb07020 TATCTGTGCGGTGGTTGTGGCAATGGCGGAGTGGACAGGAGGAGGTGGTCGTTGGTGCGC * * *** ** ** *

Mn12262 CGAGCTGTCAGCGGGTGCACGGCGT-GCCTGCTGCTCCTCGACTGGGACCTCGACGACCT Mm10095 CGCGCTGTTAGCGGGTGCACGGCGT-GTCTGCTGCTCCTCGACTGGGACCTCGACGATCT Mm99049 CGCGCTGTTAGCGGGTGCACGGCGT-GTCTGCTGCTCCTCGACTGGGACCTCGACGATCT Mn11037 CGAGCTGTAAGCGGGTGCACAGCCT-CTCTGCTGCTACTGGACTGGTACCTCGACGACCT Mn10106 CGAGCTGTCAGCGGGTGCACGGCGT-GCCTGCTGCTCCTCGACTGGGACCTCGACGACCT Mb07020 GAGGCAGTCCTCCAGTGCTCGGTGAAGCCTGTAGCT----AACATGCACAAGACGGGCGT ** ** * **** * * *** *** ** * ** * *

Mn12262 CGAGAAGGA---GATTGTCCAACATCCAG-CTTCCACTGCTGATGTTGGCCTCGCGGGAC Mm10095 CACGAAAGA---GACTGTCGAAGATCCAA-CTCCCACTGATGAAGTCGGCCTCGTGGACC Mm99049 CACGAAAGA---GACTGTCGAAGATCCAA-CTCCCACTGATGAAGTCGGCCTCGTGGACC Mn11037 CGAGAAGGATGAGACTGTCGAACATCTCA-CTCCCACTGTGGCCGTCGGGATCGTGGACC Mn10106 CGAGAAGGA---GATTGTCCAACATCCAG-CTTCCACTGCTGATGTTGGCCTCGCGGGAC Mb07020 GGACAATGCGGCTGGCTTTCGACATCTCGTCTCGCACTTCCACCCAGCAGCGCCCAGCAC ** * * * *** ** **** * * *

Mn12262 CCAGGATCGCGACGGCGGCGG---CGGCTCGCACCGCCTGTCGG-TGCGCCCTGATTAAC Mm10095 CCAAGACCACGGCGGCAGTGGAGGCGGCTCGCACCGCCTGCCGC-TGCGCCCTGATCAAC Mm99049 CCAAGACCACGGCGGCAGTGGAGGCGGN------CCTGCCGC-TGCGCCCTGATCAAC Mn11037 CCGGGAGCACGGCGGTGGCTCACACTCCTCG------CTGCTAC-TGCTGCTTGATCAAC Mn10106 CCAGGATCGCGACGGCGGCGG---CGGCTCGCACCGCCTGTCGG-TGCGCCCTGATTAAC Mb07020 CCACACAAACACCACTCAAGACGACACCATGGCCGAACCACAGCACTCTGCGTTTTTCAC ** * * * * * * * * **

Mn12262 CTGGGAGACTCGCGAGCCGTCGTAGCCGACTTGCACAACGGCGA------Mm10095 CTAGGCGACTCGCGAGCCGTGGTGGCCGACTTGCACAACAGCGA------Mm99049 CTAGGCGACTCGCGAGCCGT------GACTTGC------Mn11037 TTGGGAGACTCTCGAGCCGTGGTCGCCAACTTGCACAACGACGATGACGACGACCACCAC Mn10106 CTGGGAGACTCGCGAGCCGTCGTAGCCGACTTGCACAACGGCGA------Mb07020 GCGCCTGCCTCTCGAGATCAGGGTCCAGATCTACGATATGTGCACGCTGG------* *** **** * * *

Mn12262 --CCACCATCACCCCGACCCGTCGTCGGCTGCCATCCAAACA---CCCAGACCAGACTTG Mm10095 --CCGCCATCATCCCGACCCGTCCTCGGCTGACACCCCTACAAAACCCAAACTAGACTTG Mm99049 ------TG Mn11037 CACCACTACCATCCCGATCCGTCGTCAGCCGCCGCCGCCCCGACGCCGCCACCACCGCTG Mn10106 --CCACCATCACCCCGACCCGTCGTCGGCTGCCATCCAAACA---CCCAGACCAGACTTG Mb07020 -ACCACTTGCAGAGAGCCACACGATCCGTCGCGCCCGTGTCA--GGCGCCACC-TACCCA

Mn12262 GTGC------GGCAAACCAGCGACGTCAATTCCTCGGTAGCCTCCGAGCGGGGG Mm10095 CTGCT------GCGACAAACCAGCGACGTCAACTCCTCAGTGGCCTCTGAGCGGGCG Mm99049 CTGC------GACAAACCAGCGACGTCAACTCCTCAGTGGCCTCTGAGCGGGCG Mn11037 CTGCCCGCGGCGCTGCGACAGACCAGCGACGTCAACTCGTCGGCGGCGTCCGAGCGGGCT Mn10106 GTGC------GGCAAACCAGCGACGTCAATTCCTCGGTAGCCTCCGAGCGGGGG Mb07020 CCGTC------GCCGAATGAGAACACGCAGCTCATCAC--GTTTGAGAGCCACGT * * * * ** ** ** ** * * ****

Mn12262 CACATCCTGCGGCAGCACCCTCTGGATGACCCACGCGATGTCATCGTCGGCGGGCGGCTC Mm10095 CACATCCTGCGGCAGCACCCGCTCGACGACCCACGCGACGTCGTCATCGGCGGGCGGCTG Mm99049 CACATCCTGCGGCAGCACCCGCTCGACGACCCACGCGACGTCGTCATCGGCGGGCGGCTG Mn11037 CACGTCCTGCAGCAGCACCCCCTCGACTACCAGCGCGACGTCGTCGTCGGCGGCCGGCTC Mn10106 CACATCCTGCGGCAGCACCCTCTGGATGACCCACGCGATGTCATCGTCGGCGGGCGGCTC Mb07020 CGAGTTCCCGCCCTGGCTCCTGTGGACATGCCGCCAGATCAGAGCGGAGATGGCCG---C * * * * * ** * ** * * ** * * ** **

Mn12262 TTTGGCGAGACGCTCTCTACGCGAGCCTTTGGGGACGC-GCACTACAAGTTGCCAGTCCG Mm10095 TTTGGCGAGACGCTCTCCACGCGAGCCTTTGGGGACGC-GCACTACAAGCTGCCGGTCCG Mm99049 TTTGGCGAGACGCTCTCCACGCGAGCCTTTGGGGACGC-GCACTACAAGCTGCCGGTCCG Mn11037 TTCGGCGAGACGCTCTCCACGCGGGCCTTTGGGGACGC-GCACTACAAGCTGCCGGCGCC Mn10106 TTTGGCGAGACGCTCTCTACGCGAGCCTTTGGGGACGC-GCACTACAAGTTGCCAGTCCG Mb07020 AGTGGCG----GCTCAGTACATACGCTTTGGTGTGTGCTGCACGTCTACGTGCGGTCACG **** **** ** ** ** * * ** **** * * *** *

Mn12262 CGAGC--GCA-----GAGAAG------AGCAGCAGTTGGACCACCGGCA-AGTCGA Mm10095 CGAGCCCGCG-----GGGGAGCAGCGGCGGCAGCAGCAGCTAAATCACCGGAA-ACTCGA Mm99049 CGAGCCCGCG-----GGGGAGCAGCGGCGGCAGCAGCAGCTAAATCACCGGAA-ACTCGA Mn11037 GGCGCCGGCGCCGCCGGCTCGCGGGGATGGCTACACCTGGTTACCTATCAGCCTACTCGG Mn10106 CGAGC--GCA-----GAGAAG------AGCAGCAGTTGGACCACCGGCA-AGTCGA Mb07020 ACCGCTGGTA------CGACGGACGGCCGGTTGTGA--CTTGG ** * * * * * *

Mn12262 CGGCG---GGGGAAACCC---CGGC--GTGGTCGCAGGGACAA---GACTATCGGCCCGG Mm10095 CGGCGAGGGGGGAAAACCAGTCGGC--ACGGTCCCGTGGGCGGCGCGACTATCGACCCAG Mm99049 CGGCGAGGGGGGAAAACCAGTCGGC--ACGGTCCCGTGGGCGGCGCGACTATCGACCCAG Mn11037 ACGCCTCTAGCTCATACTGGCCGCCCGGCTGGAACTTCGCCCGCTTCGCTAACTCCACTG Mn10106 CGGCG---GGGGAAACCC---CGGC--GTGGTCGCAGGGACAA---GACTATCGGCCCGG Mb07020 CCGTG----GCGCTGCCCGCCAGGCACGCTGCCACGTCAACTGCATCTCGAGTGGCGCCT * * * * * * * * * * * *

Mn12262 GAGAGGAAGCTACATCGGCACTACGTCGAGCGGATGAGCGCGCGCCTGCTCGCTGCCACG Mm10095 GAGAGGAAGCTGCACCGGCACCACGTCCAGCGGATGAGCGCGCGGCTCCTCGCTGCCGCG Mm99049 GAGAGGAAGCTGCACCGGCACCACGTCCAGCGGATGAGCGCGCGGCTCCTCGCTGCCGCG Mn11037 CCGAGGAGTATGCCTCG-CTCTCTGACGAGGAGCGCAATAAGATGCATGCTGGTTTCAGA Mn10106 GAGAGGAAGCTACATCGGCACTACGTCGAGCGGATGAGCGCGCGCCTGCTCGCTGCCACG Mb07020 CCCGCGCGTGCTCGCCG-----AGAACATTAGGCTGAGCAGATACTTCGGGGTCGTCGAG * * ** * * * * *

Mn12262 CAGGCTGGCACTCGCGAGAAAGTCGTGCGAGACGGCGACGGTGATGGTGGTGACGGTACT Mm10095 CAGGCCGACGCCCCAGCGCAGGCCGCGCGGG------Mm99049 CAGGCCGACGCCCCAGCGCAGGCCGCGCGGG------Mn11037 GATGCGCTCGGCCAGGACGGCTTCGC-CGAGCTATCCGAAATGATGTTCCAGGAGAAAGT Mn10106 CAGGCTGGCACTCGCGAGAAAGTCGTGCGAGACGGCGACGGTGATGGTGGTGACGGTACT Mb07020 GATGCGTTTGTTCGCCGCCAGGTCCTCCAGCC------* ** * * *

Mn12262 GCCCCAGCAGCTACCACACCACCGCAGCCTCGCCACGGCGCGACGAGGACGCTGACGCTC Mm10095 ------CGAGGACGCTGACGCTC Mm99049 ------CGAGGACGCTGACGCTC Mn11037 ACGTATTATG------AGAGAGGAAGA-GAGGCGA Mn10106 GCCCCAGCAGCTACCACACCACCGCAGCCTCGCCACGGCGCGACGAGGACGCTGACGCTC Mb07020 ------GGGAGAGAGCCGGTACCT *** * * *

Mn12262 GAGGAGCGGTACGAC-GCCATGTTCTCGTCGTACCACACGCCGCCGTA-CGTCTCGGCGG Mm10095 GAGGAGCGGTACGAG-GCCATGTTCTCGTCGTACCACACGCCGCCGTA-CGTCTCGGCGG Mm99049 GAGGAGCGGTACGAG-GCCATGTTCTCGTCGTACCACACGCCGCCGTA-CGTCTCGGCGG Mn11037 CAGAAGCAATTAGCCAGCGGTAGTACTGCCCCAATACCGCTCACTGAAGCGCCCCAGTGG Mn10106 GAGGAGCGGTACGAC-GCCATGTTCTCGTCGTACCACACGCCGCCGTA-CGTCTCGGCGG Mb07020 CGTCGCCGACCGGAT----GCTCTCCGAGCTGACGACGCTCCGCGTCA----CCCTGCGC * * * * * ** * * * * * * *

Mn12262 CTCCGGATGTCCAG------GTATGGTCGGAG-GCGGCG-----CCCAGGTCTTCCGTAG Mm10095 CTCCGGATGTCCAG------GTATGGTCGGAG-GCGGCGGCG--CCCGGGTCTGCTGCAG Mm99049 CTCCGGATGTCCAG------GTATGGTCGGAG-GCGGCGGCG--CCCGGGTCTGCTGCAG Mn11037 CTCAGGACGTGGAGAGAAAAGTACCGCGACCA-GCCATGGGGGTTTCTGGCCTTCTACCT Mn10106 CTCCGGATGTCCAG------GTATGGTCGGAG-GCGGCG-----CCCAGGTCTTCCGTAG Mb07020 CACCACGCAGGAATCGATCTGCGTGCTCGCAATGCAGTG------CGTGCCCTCTACGA * * * * ** * * * * *

Mn12262 AGGA------CGTC-----GCCGGTGATGATGAACACCCGGACGGCCGCGCTGCG-G Mm10095 AGAAGGATGAGGGCGTC-----GCCGGTGGTGGTAATCTCCCTGACGGCTGCGCTGCG-G Mm99049 AGAAGGATGAGGGCGTC-----GCCGGTGGTGGTAATCTCCCTGACGGCTGCGCTGCG-G Mn11037 GTCTCCTCCTCCTCCTCCTCCCGCCGCAGTTGGTACAGCCAGTGGTGATAGCGATGCAAG Mn10106 AGGA------CGTC-----GCCGGTGATGATGAACACCCGGACGGCCGCGCTGCG-G Mb07020 CCGG------GTCGAG--GCTGAGCTCAGCCGCGTCGTCAACCGTTGCACGGAGCA ** ** * * ** *

Mn12262 CTCGTTGCGAAGGTGACGTCGAGGGCA---CCCGAGGGGTCTCAGGTACACGGAGATCGA Mm10095 CTCGTTGCGAGGAGGGCGTTGAGGGCAGCGCCCGAAGAGTGT---GCGTGGGTGGGTTGA Mm99049 CTCGTTGCGAGGAGGGCGTTGAGGGCAGCGCCCGAAGAGTGT---GCGTGGGTGGGTTGA Mn11037 CACCCAGCAGCAACAGCATCAGCGTCAGGAATTCCAGGACACAGTGTGGGAGATGACTGA Mn10106 CTCGTTGCGAAGGTGACGTCGAGGGCA---CCCGAGGGGTCTCAGGTACACGGAGATCGA Mb07020 CGAGAAGCTCCGG-AGCATCGAGATGG------TTGGCATATTCCGCATCCGCTGGCTGG * ** * * * * * * *

Mn12262 TGCCGCGGAGAAGAATGCT------GGGGATTA--TGGCTACGGATGGG---CTCTGGGA Mm10095 AGTCGCGGAGAA---TGCT------GGGGGTTT--TGGCTACAGACGGG---CTCTGGGA Mm99049 AGTCGCGGAGAA---TGCT------GGGGGTTT--TGGCTACAGACGGG---CTCTGGGA Mn11037 GGTGCCATTCAACACCGCTCTCAATGAAGATCACCTAGATGCCAATGAGGCTCGCAGCAA Mn10106 TGCCGCGGAGAAGAATGCT------GGGGATTA--TGGCTACGGATGGG---CTCTGGGA Mb07020 ACGAGCTCGAGCGCAGACTCGGTGGCCGCATCA---GAGTCCTAAGAGG---CACTCCGA * ** * * * * * * * *

Mn12262 TCTCGTGAGCA-GCGAG--GAGGCCGTGGCGTT--TTTGCAACGAA----CTACCA---- Mm10095 TCTCGTGAGCA-GCGAG--GAGGCCGTGGCGTT--TTTGCAACGGA----CTGTCA---- Mm99049 TCTCGTGAGCA-GCGAG--GAGGCCGTGGCGTT--TTTGCAACGGA----CTGTCA---- Mn11037 CTTCGAGATCAAGTGGGTTGAGGTGGTAGCGTCAGCTGATGACGAGGAAGCTACTATACT Mn10106 TCTCGTGAGCA-GCGAG--GAGGCCGTGGCGTT--TTTGCAACGAA----CTACCA---- Mb07020 TAACGCTGGCT--TGGG--CGGACCGGAGCGCAGCCGCCGAGCAGACCGACTATTAC--- ** * * * * * *** * ** *

Mn12262 ---CTGTCGCAGGTCGGGATGGG---GAGAAGAATGGGAAGAAGGAGGT---CGTGAATC Mm10095 ---CAGTCGCTAGTCTGGAGGGGAATGAGAAGAGTGGGCAGAAGGAGGT---TGTGAATC Mm99049 ---CAGTCGCTAGTCTGGAGGGGAATGAGAAGAGTGGGCAGAAGGAGGT---TGTGAATC Mn11037 GGACCGTCTCCGATCCCGGTACCGA-GAGATGGTTGAGCAGCAGGAAATATCTTGGAGAT Mn10106 ---CTGTCGCAGGTCGGGATGGG---GAGAAGAATGGGAAGAAGGAGGT---CGTGAATC Mb07020 GAGAAGATACCAGAGTCGTTCAACCGGACAGCGGCTACCAAAGGCAGGCCTTTTCAAGGT * * * ** * * * * *

Mn12262 TTGCTGAGGCGCTGCTGAGGCATGTCGTCGAGGAGAAAGGCAGGAAGCCTG-----GCGA Mm10095 TTGCAGGGGCGTTGCTGAAGTATGTTGTCGAGGAGAAAGGCGGGAGGCCTG-----GCGA Mm99049 TTGCAGGGGCGTTGCTGAAGTATGTTGTCGAGGAGAAAGGCGGGAGGCCTG-----GCGA Mn11037 TCGCCCAAACGCTTTTC---CTTGTCGTAAACGTAAGTAGTGTTAGCTCTGTATTAGTAG Mn10106 TTGCTGAGGCGCTGCTGAGGCATGTCGTCGAGGAGAAAGGCAGGAAGCCTG-----GCGA Mb07020 CAGCCAGAGCCTCGCGA---TCTATCACCAA-GAGATGAGCGGCAGGGA------GCAC ** * * * * * * * * *

Mn12262 TGACATAACTATCCTGG-TTGTCGAGTACG------ATGTTTCGGGC- Mm10095 CGACATCACGGTCCTGG-TTGTCGAGTACG------ATATTTCGGGC- Mm99049 CGACATCACGGTCCTGG-TTGTCGAGTACG------ATATTTCGGGC- Mn11037 TGCCCGAATCGACTAGACTTGGCAAGTACGGG------CGACCAGATGCGCCGTTC- Mn10106 TGACATAACTATCCTGG-TTGTCGAGTACGATGTTTCGGGCCGGCTAGATGTAATGTCCA Mb07020 CGTTACCGCGACCCTGCACT-TTGATCATC------CATGGCAGCC- * * * * * * *

Mn12262 ------CGGCTAGATGTAATGGCTCGTGCCCAGCAGGGCAGTATTCAAGCCATGCAAA Mm10095 ------CGGCTGGATATAATG------CAAA Mm99049 ------CGGCTGGATATAATGTCCCGTGCCCACCAGGGCAGCAGTCAAACTATGCAAA Mn11037 ------CTGCTCGCCGTGGAAAC-CGTGGACGGCGAGATCGACCCCAGGGACGAAGAG Mn10106 AAGACTCCTCGATTGGCATCATGGCTCGTGCCCAGCAGGGCAGTATTCAAGCCATGCAAA Mb07020 ------CCAGCTTCCACG------CGGC------CCAT * *

Mn12262 ACCCCGAAACCCCGTTCCTCAGCCGCCTTCCAG-TCGAGCTC----CGCCTCCAGATATA Mm10095 ACCTCAAATCTTCATTCCTCAGCCGCCTTCCAG-TCGAGCTC----CGCCTCCAGATATA Mm99049 ACCTCAAATCTTCATTCCTCAGCCGCCTTCCAG-TCGAGCTC----CGCCTCCAGATATA Mn11037 GCCGAGCGAGTTCGGTACGCGCCGGTGTTCAAGGTCGCTATCGAAGCGCTTGTGGAAGAG Mn10106 ACCCCGAAACCCCGTTCCTCAGCCGCCTTCCAG-TCGAGCTC----CGCCTCCAGATATA Mb07020 CTGCGTCTTTCCCGTAAGCCATCGGCACACGTGCTCGTCGAC----TACTTGGGCAATGT * * * * * * *** * * * *

Mn12262 C---GAAATCTACATGCTCAGTTATCTCGAGACAGCGAAGT----ACACACTAGCACCAC Mm10095 C---GAGGTCTACATGCTCAACCACCTCCAGACAGCGATGT----ACACGCTAGCACCAC Mm99049 C---GAGGTCTACATGCTCAACCACCTCCAGACAGCGATGT----ACACGCTAGCACCAC Mn11037 CTCTGGCATTCGGTCGAAGGGCAGACCCTAAACATGGAAACCCTGACACGGTGGGTGCGT Mn10106 C---GAAATCTACATGCTCAGTTATCTCGAGACAGCGAAGT----ACACACTAGCACCAC Mb07020 C-----GATCCAACTGC--GACCAGCGGCGAGCAGGGACTGAAGGAAGCAACAGAGGCCA * * * * ** ** * * * *

Mn12262 TAATCCCACCACGCAT-GCCCAAC--CCAGGCGAAACTAGCCGGGGCCGCCGGACCCGCC Mm10095 TCATCCCACCACGCAT-GCCCAAC--CCAGGCGAGACGCGCCGAGGCCGCCGGACCCGTC Mm99049 TCATCCCACCACGCAT-GCCCAAC--CCAGGCGAGACGCGCCGAGGCCGCCGGACCCGTC Mn11037 GAAGCACGGCTGGTAGAGGATAGTGACGAAGAGAGCCAAACCGATGACGACCTCTACGAT Mn10106 TAATCCCACCACGCAT-GCCCAAC--CCAGGCGAAACTAGCCGGGGCCGCCGGACCCGCC Mb07020 GGCTCGGGACCTGGGC-GAACAAAGCACAGACTACGGCCGCCGCAACCGCTGCGGGCTTG * * * * * * * *** ** *

Mn12262 GCTGCGAGAAC----CATCTTATCGGCTTCGAGGTCGTGGGCGGGGACTGTCCGCCACTG Mm10095 GCTGCGAGAAC----CATCTCATCGGCTTCGAGGCCTCGGACGCGGACTGCCCGCCAATG Mm99049 GCTGCGAGAAC----CATCTCATCGGCTTCGAGGCCTCGGACGCGGACTGCCCGCCAATG Mn11037 ATTTGGTGGACTGCGCATGCTCCTAGTCGTCATATCAGGAAGAAGAGAGGACTCGAAATG Mn10106 GCTGCGAGAAC----CATCTTATCGGCTTCGAGGTCGTGGGCGGGGACTGTCCGCCACTG Mb07020 ACAGCGGCCATGGTGCAGCAGGCTCTTTCGGAGGCC---AACCTCGTCCACATCATCATG * * ** * * **

Mn12262 CTGCGAG-TATGCCGACAGGTCAGGGCCGAAGTTTCGCCCATCGCCGCGCAGTACCTGCG Mm10095 TTGCGAG-TATGTCGGCAAATCAGGGCCGAGTTTGCGCCCATCGCCGCGCAGTACCTGCG Mm99049 TTGCGAG-TATGTCGGCAAATCAGGGCCGAGTTTGCGCCCATCGCCGCGCAGTACCTGCG Mn11037 CCGCCCTCTGACCCGACAACTCTCATCTTCATCATCGCCGAACATCTCAC--CATCAGTA Mn10106 CTGCGAG-TATGCCGACAGGTCAGGGCCGAAGTTTCGCCCATCGCCGCGCAGTACCTGCG Mb07020 CCCCCTTCTGATCCAGCAGCTCTCGTCCTCATCATCGCAGAACATCTCAC--CATCAACA * * * ** ** * *** * * * * * * *

Mn12262 CTTCATTGTTTCGTCCGAGGCGTCGGTAAACTGGGAAGAAGTGCTTACATATCCCGTCCC Mm10095 CTTCATTGTGTCTTCCGAGGCTTCGGTAAACTGGGAAGAAATACTACCTCATCCCATCCC Mm99049 CTTCATTGTGTCTTCCGAGGCTTCGGTAAACTGGGAAGAAATACTACCTCATCCCATCCC Mn11037 CTCCTCTACCCTCATCATCCCTCAATTCCCCTAGCATAAAGCTTATCACCACCACGTCAG Mn10106 CTTCATTGTTTCGTCCGAGGCGTCGGTAAACTGGGAAGAAGTGCTTACATATCCCGTCCC Mb07020 CCTCCCTCCCACAATCACCCATACAGTCACTCAGCGCGCAGCTCATCAACACCGCACAAG * * * * * * * * * *

Mn12262 GCTCACGATGCCGCCCCGCTGCCCGC----CCGGCGCGATCGTGCAGCATCTCCATCTCG Mm10095 GCCCACGATGCCGTCTCGCTGCCCGC----CCGATGCGATCGTGCAGCATCTTCATCTCG Mm99049 GCCCACGATGCCGTCTCGCTGCCCGC----CCGATGCGATCGTGCAGCATCTTCATCTCG Mn11037 AAACTATGGCCAGCCCCGTTGCCTTCTCCGCCATTGAGGATGACGACTACAGCTACCTTT Mn10106 GCTCACGATGCCGCCCCGCTGCCCGC----CCGGCGCGATCGTGCAGCATCTCCATCTCG Mb07020 CCATCATGTCTTCCCCCATCGCCTTCTCCGCCATTGAGGACAACGACTACAGCTACCTGT * * *** * ** * * * * * **

Mn12262 AGTGG---CGCCTCAACGGCCACACTCCGCCGC-----ACCGGT-CAACGAATATG--CA Mm10095 AGTGG---CGTCTGAACGGCCACACTCCGCCGC-----ATCGAT-CGACAAACATG--CA Mm99049 AGTGG---CGTCTGAACGGCCACACTCCGCCGC-----ATCGAT-CGACAAACATG--CA Mn11037 ACTATATCCGCCAGAATAACTCCATTGCGGTGCTGAAAAGCAACACGACTCATGAGGGCA Mn10106 AGTGG---CGCCTCAACGGCCACACTCCGCCGC-----ACCGGT-CAACGAATATG--CA Mb07020 ACTATGCCCGCAAGAATGGCTCCATCGCGGTGCTGAAGAGCAGCACGACGCAGGAGGGCA * * ** ** * ** ** ** * * * ** * * **

Mn12262 GCGCTACA---TGTCGATAATCTGGCATCGGTTCGAGGTACACCAGTACCAGGGCACTGG Mm10095 GCGCTACA---TGTCGATTATCTGGCATCGATTCGAGGTACACCAGTACCAGGACGCTGG Mm99049 GCGCTACA---TGTCGATTATCTGGCATCGATTCGAGGTACACCAGTACCAGGACGCTGG Mn11037 ACGACACAGTGTACAAAGCTTCCAGCGTCATCCTCTCAGGAAACACCGTCAGCACGCTTT Mn10106 GCGCTACA---TGTCGATAATCTGGCATCGGTTCGAGGTACACCAGTACCAGGGCACTGG Mb07020 ACGACACCAAGTACACCCCTACGAGTGTCATCGTCTCAGGCAACACGGTCAGCACCTCGT ** ** * * * ** * ** *** *

Mn12262 AAGGTTCCATGCGGCGGGTTCATTTTCGCTGCTGTCTAGGCTGTCGGC-GATACGAATC- Mm10095 ACGGTTCCATGCGGCCGGTTCCTTCTCACTGCTGTCTAAGCTATCGAC-GATGCGGATC- Mm99049 ACGGTTCCATGCGGCCGGTTCCTTCTCACTGCTGTCTAAGCTATCGAC-GATGCGGATC- Mn11037 CTCCAAACATCAGCGCCGTATCTTACAAGGACGGGCAAGGCAACCGCCAGGTCCGCATCT Mn10106 AAGGTTCCATGCGGCGGGTTCATTTTCGCTGCTGTCTAGGCTGTCGGC-GATACGAATC- Mb07020 CCACGAACATCAGTGCTGTGTCGTACAAGGATAACAACGGAAATCGCCAGGTTCGCATCT *** * ** * * ** * * * ** ***

Mn12262 ---ATTCTAGTGCCATCGAGGCAGATGAGTGGTGTC------GGCGGCACAAATT--CG Mm10095 ---ATCCTGGTGCCATCGAGGGAGATGAGTGGTGTC------GGCGGCACAAATT--CA Mm99049 ---ATCCTGGTGCCATCGAGGGAGATGAGTGGTGTC------GGCGGCACAAATT--CA Mn11037 ACTATATTGGCACGAGCGAGACCGAAGGCTGCCAGCTCAGGGAGCTGGTCCAGACCAACG Mn10106 ---ATTCTAGTGCCATCGAGGCAGATGAGTGGTGTC------GGCGGCACAAATT--CG Mb07020 ACTACATCAGCTCGGCCGAGACCGGCGGCTTCCAGCTCAGCGAGCTGGTGCAGACCAACG * * * **** * * * * * ** ** * *

Mn12262 ATGGTGAATAATCTCTGCATACG---GACAA------ACGGGCAGCTAGATGTCC- Mm10095 ATGATGAATAATCTCTGCATACG---GACAA------ACGGGCAGCTAGATGTCC- Mm99049 ATGATGAATAATCTCTGCATACG---GACAA------ACGGGCAGCTAGATGTCC- Mn11037 GCGGCGAGTTTACCCAGGGCGAGCTCGACAACAACACCCTGAAATGCAGCGAGACTTCCC Mn10106 ATGGTGAATAATCTCTGCATACG---GACAA------ACGGGCAGCTAGATGTCC- Mb07020 GCGGCGAGTTCACCCAGGGCGAGCTCGACAACAACTCTCTGGCCTGCGGCGAAAACTCCC * ** * * * * * ***** ** ** * * ***

Mn12262 ---TTGTCGCCATGTGCGAGCGGTGCCCGGAG---CTCAAGAAGCTC------GAGTTCG Mm10095 ---TCGTCGCCATGTGTGAGCGGTGCCCGGAG---CTCAAGAAGCTC------GAGTTCG Mm99049 ---TCGTCGCCATGTGTGAGCGGTGCCCGGAG---CTCAAGAAGCTC------GAGTTCG Mn11037 TCCTCAGCGCCAATGTCGAGTTTGGCAAGGGAGATCTCAAGATTTTTTACCAGGACACCA Mn10106 ---TTGTCGCCATGTGCGAGCGGTGCCCGGAG---CTCAAGAAGCTC------GAGTTCG Mb07020 TGCTGAGCGCCAACGTCGAGTTCGGCAAGGGCGACCTCAAGATCTTCTACCAGGACACGC * ***** *** ** ** ******* * **

Mn12262 TGGGGCTTTTTGAAAAAGCTTGGCTGGACACGGTGGAGGAGAGGATGAGGGCACGAGAGG Mm10095 TGGGGCTCTTTGAAAAGGCTTGGCTGGACGCAGTGGAGGAGAAGATGATGGCCCGGGAGA Mm99049 TGGGGCTCTTTGAAAAGGCTTGGCTGGACGCAGTGGAGGAGAAGATGATGGCCCGGGAGA Mn11037 AGGGCTTCCCCTACGTCGCCTGGGTTG-TGCTGGGGCAGACAGCTTGGACTTCTCACCCG Mn10106 TGGGGCTTTTTGAAAAAGCTTGGCTGGACACGGTGGAGGAGAGGATGAGGGCACGAGAGG Mb07020 GCGGCAACCCCTGGGTCGCGTGGGTCG-TGCTCGGCCAGACCGCCTGGGCCAGCCACCCG ** ** *** * * * * ** **

Mn12262 TGAAAGTGTTTC------GTGGCAAGGTGCTTCCTGGACAGGGGCAGCAAGAGTC--- Mm10095 TCAAAGTGTATC------GGGGCAAGGTGCTTCCGAGACAGGGG---CAAGAGCC--- Mm99049 TCAAAGTGTATC------GGGGCAAGGTGCTTCCGAGACAGGGG---CAAGAGCC--- Mn11037 CTCAAGCCTATCCCTTTCAAGTGGCAATTCATCTTGTGGATAGGCGC-GCGACTGTCCTC Mn10106 TGAAAGTGTTTC------GTGGCAAGGTGCTTCCTGGACAGGGGCAGCAAGAGTC--- Mb07020 CTCAAGCCTGTCCCCTTCAAGTGGCAAATGGCCTCTCAGGACAGTGACTCCAGCG----- *** * ** * ***** * * * * * * *

Mn12262 -TGATCAGCCTCACATGATACCATGGC--CTGTTTTTGAGGATT-GCCCAATG------Mm10095 -AGATCAGCCTCACATGATGCCATGGC--CTGTTTTTGAGGATT-GTTTAATGC----CA Mm99049 -AGATCAGCCTCACATGATGCCATGGC--CTGTTTTTGAGGATT-GTTTAATGC----CG Mn11037 CAGCCCAATGCCACCGTGCACCATGGCAACAGAAAGCGAAAGCAAGCTCTGTGT----CG Mn10106 -TGATCAGCCTCACATGATACCATGGC--CTGTTTTTGAGGATT-GCCCAATGCATCTCG Mb07020 -AGGAGGACGT----GGGCATCCTCCCAGCAACTCTTGC--CCTCGCCCGGCA------* * * * * * *

Mn12262 ------Mm10095 CCCATCATGGCCCCGC---AGGACCGCGACACGAGCGATGACGACAATGGAACCCCCCCG Mm99049 CCCTCTG------ATCCAGCATCTCTC Mn11037 GCGAGCCCTTCACCTT---CCCCAATGGCACCACGATTCGAAACCGGCTCGTCAAGTCCG Mn10106 GAAAGCACCAACTTGAGAACGTGCAGGGCTTCGGGCTTGACGTTCAGGAGGTCGCAGTGG Mb07020 ------

Mn12262 ------GATTCAAG-TCAACTATCTCAGCC------CGCCAC--A Mm10095 ATCAGACTGGCTGCGGCCAGGCTTGAC-TCTGCTTTCATGGCCACATGGTCTGCCATCGA Mm99049 ATCCTCATCATCGCAGAACATCTCACCATCAACAATCCTCTTT-CTTCCCAACTTACTGA Mn11037 CCATGGCAGAAGGTTACGCAACCAAGGATCTCC--TCCCAAACGACGGATACGCCACTAA Mn10106 AGATTGGCGAACATCTCACCATTAACAATCCTCTCCCATCTTCACTCATCAGTTCACTAA Mb07020 ------GGACCCCGTCTTCATGGCCAAATGGACTGCCATCGA * * * *

Mn12262 TTC------ACTA-AGCACAAAGCTC----ATCAAC------GCTACG--- Mm10095 TCCCAAGCCACGATACTACAACAAAAAGGTCCCAAATCAGTACTCCAGCAAGCTGCG--- Mm99049 TTC------ACCA-AGCACTAAGCTC----ATCAAAAC------AAACCACG--- Mn11037 CTACGCGAGATGGGCCCA----AGGGGGCTGGGGGATGGTGATTACAGGGAACTACATGG Mn10106 GCACAAAGCTCATCAACGCTACGCAAGTAATCATGACTTCCCCTGTCGCCTTCTCCGCTG Mb07020 GCCCAAGCCACGCTCCTACAGCAAGAAAGTCCCCAACCAGTACTCGAGCAAGCTGCG--- * * *

Mn12262 -CAAGTA------ATCATACTTCCCCTGTCGCC------Mm10095 -CAAGCTCCCCGAC---ACCCAGCTCCACTACGCCCTCCTCGATGCCCAGGAGTACTGGA Mm99049 -CAACCA------TGGGAAACCCTGTCGCC------Mn11037 TCGACCCCAAAGGC---ACTGCCGGCCCCGGCGTGCTGCTCGTCAACAACCCCAAGGTCC Mn10106 TCGAAGACAACAACTACAGCTACCTCTACTATGCTCGCAAGAACAGCTCAATTGCTGTTC Mb07020 -CAAGCT------GCCTCACGGCCAGC------* * *

Mn12262 ------TTCTCCG-- Mm10095 TAGAAATCCTCGCAGACCACTGTCCGCCCGCCTGGGTCGGCATCGCCAGTGTCCTCCAGC Mm99049 ------TTCTCCG-- Mn11037 CGCACGACGTCCAGGTCGCCAAGATCGCAGCCTTTGCCCGCGCAGCCCAGGCCAACGGCA Mn10106 TAAAGAGTAACACGACACAAGAGGGCAACAACACAAAATACACACCCACGAGCGTCATCG Mb07020 ------TTCACTA-- *

Mn12262 ------CTGTCGAAGACAACAAC------TACAGCTATC Mm10095 ---AGCTTGCCACCATCATCCCCATGTCTGCCGCAGCCAGCACCAGGCCCTACATCGCCC Mm99049 ------CCGTCACGGACGACAAC------TACAGCTACC Mn11037 CCAGGGTCCTCGCGCAGCTCTGCCACGCCGGCCGCGCCGGCATCATCGGCGGCGGCGGCG Mn10106 TCTCGGGAAACACCGTTAGCTCGAGTTCCACAAACATCAGCGCTGTGGCTTATAAGGACA Mb07020 ------

Mn12262 TCTA------CTATGCTCGCA------AGAACG--G Mm10095 TCAAGGCCGCGCTCAAGAAGGCCATCCATGCTAGCACACAAGCGGGCTGGAAGGACGATG Mm99049 TGTA------CTACGCTCGGG------AGGACG--G Mn11037 ACCGGTCGCTGTGCGAAAAGAACATGGCCCCCAGCGCTGTCCCCTTGTCCATTGGCGACG Mn10106 GCCGCGGCAACCGCCAGGTCCGTGTCTACTATATCGGCACCGCTCCAACTGGAGACTTCC Mb07020 ------

Mn12262 --CT--CAATTGCTGTTCTG------AAG---AGTGA Mm10095 --CTGACATCCGCTGTCCCGCTGTGCCACTAGCGCTCTTATACGCAGTCAAGGTCAATGG Mm99049 --CT--CTATTGCTGTGCTG------AAG---AGCAG Mn11037 GCCTAGCCGCGCGTGTCCTGTCGTCCC--TGGTGTTTGGCTCGCCGAGGGAGAT-GATGA Mn10106 AGCTCAGGGAGCTGGTCCAG----ACCAACGGCGGCGATTTCACCCA--GGGCCAGCTCG Mb07020 ------CGCCCTCCTCG------*

Mn12262 CACGACGCAA-GAGGG------CAACAACACGAAATACA------Mm10095 CAGTGCGCAG-AATGGAGCTGGAGACATGCCCCCCAAGATCACTGAAGACATCGCCGCCG Mm99049 CACGACTGTT-GAGAA------CAACAAGACTACGTACG------Mn11037 CCGAGGATATTGAGAGTGTGATTACCAAGTTTGCTAGCGCGGCTAAGATTATGGCTGACG Mn10106 ACGAAAATTCTCTGCCCGCCAGCGAGAACTCCCTCATCAGCGCCAATGTCGAGTTTGGCA Mb07020 ------ACGCCCAAGAGTAC------*


Mn12262 ------CA---CCCACGA-GCGTCATCGTGTC--GG------GAAACAC Mm10095 CTCGCCTTGACTGGCTCA---CCCACGA-GC-CTAGCGAGTC--CGTCATTCCGAAATGC Mm99049 ------AG---TCTGAGA-ACATTCTCGCCTC--CG------GCAACGT Mn11037 CTGGGCTCGACGGGATCGAGCTGCACGGTGCGCACGGCTATC--TG-CTCTCGCAGTTCC Mn10106 AGGGCGACCTCAAGATATTCTACCAGGATACGCACGGAAACCCCTGGGTTGCGTGGGTCG Mb07020 ------TGGATCGAGATCCTCGCCAACC------ATTTCC * *

Mn12262 CGTTAG-C------TCGAGTT------CCACAAACATCAGCGCTGTG-GCTTATAAG Mm10095 TGGCGGTC------TTGGGCCATTCG--CCACAAATATCCTCAC-GCG-GCGTACATT Mm99049 CGTCCG------GAGCACTT----CCAAGGACATAAGTGCCGTG-TCTTGGACG Mn11037 TGTCGGCCGAG-----TCGAACCAGAGGACCGATGAGTACGGCGGCAGCGCAATTAACAG Mn10106 TGCTTGGCCAAACGGCTTGGGCCTCTCACCCGCTCAAGCCTATCCCTTTCCAGTGGCATT Mb07020 CGCCTGCC------TGGGTC------AGTGCTGCC------GACGTC * * *

Mn12262 GAC-AGCAGCG-----GCAACCG------CCAGGTCCGT-----GTCTACTA--- Mm10095 GTCCAACAGCT-----GCCGCCGGCCGAACCTATCCAAGCCCGCCG---GTCTGTCAACA Mm99049 GAT-AAGAGCG-----GCGCCCG------CCAGGTTCGC-----GTCTACTA--- Mn11037 GACGAGATTCGTG-ACGCGGGTGATCAAGGCGGTCCGGGCTGTCGT---GCCCGCTGGCT Mn10106 GCTCAACCTTGGACGAGCCTCGGCGCAAGGAGCTGTGGGGTGACAGTAGATTATCCAGCA Mb07020 CTCCAGCATCT----CGCCACTA------TCGTGCCGATGG---CCTCGGCG--- * ** *

Mn12262 ----TATCGGCACCG------CTCCA------ACTGGAGACT Mm10095 G-GTCCTCGGCACGATCTT---CTTCATGGCATGAAGACCG--GATGAACATCAAAGGCA Mm99049 ----CGTAAGCACTA------AACCG------GGCAGTGGCT Mn11037 TTATTGTCGGGATCAAGCTTAACTCGGTGGATTATCAAACA--GACAGGAGCACCAATGC Mn10106 TGAACATGGTTGCTCAGCTGAGTGACAAGGATTCAGGGACACTGCCGGCCGGCCCGATGA Mb07020 -----GCCAGCACCAAACCATACCACA------CACTCAAGGCT

Mn12262 TCC------AGCTCAGGG------AGCTG Mm10095 TCCCG-CGGCGATTCAGTGTGCCACGGCCAACGAGCCCAGAGCTTGGACAC----AGTCG Mm99049 CCC------AGCTCAACG------AGCTA Mn11037 GACAA-AGACGGGTGTGATGGACAGCGAGAAGCAGCTGCAGGAGTGCATCG----AGCAG Mn10106 AACAGCAGACAAGCGAGGCCGGCAGGTTCGTCGAGTTGCGAAGCTTACCCGCCACCATGG Mb07020 GTC------CTCAAGAAGGCAACGCA---CGCAG *

Mn12262 GT------CCAGACCAACGGCGGCGAT------Mm10095 ATATGCGAGAGGCAACCACAATGACGACCACGATGGAATGCT--GATCGAGGACGACGAG Mm99049 GT------CCAGACCAACGGCGGCAAA------Mn11037 TTCGCCG----CCATCGCGGCTGCTGGGGTCGACTTTGTCGA--GGTCAGCGGAGGTTCC Mn10106 TCTCGTCCAAGATCCTCGGGCTGCTGTTCTCAGCTGCCCTCGCCGTTCAGGCTTATCCGC Mb07020 GC------ACCCACGCCGGGTGGAAGGAC------

Mn12262 ------TTCACCCAGGGCC-----
Mm10095 TCTTTCGAACT--TTGGGACGCCTCCTTCCGTGATGAGGAGCTCCGCCAAGGACCTAATG
Mm99049 ------TTCACCCAGGGTGACCTC
Mn11037 ATGTCGAACCCGGTGATGTCAACCGGCCCGCCCAGGTCCGACCGCACCAAAGCCCGCGAG
Mn10106 AGTGTGGGTGCGCAGCCGACGACTGCTTCGCCAAGGCGTCGCTGC-CCGCCGCCTTCCAG
Mb07020 ------GACGCCAACCTCCGC---
* **

Mn12262 ------AGCT----CGA------CGAAAACTC-TCTGC------Mm10095 AGCACAAGCTACAGTT----CGAG--CT-CGAAGACTCCTCTTCAAGAGGTT---TCAGA Mm99049 AGCGGAAACTCTATGC----CGTG--TGGCAAAGACTC-CCTAC------TCAGC Mn11037 GCGTTCTTTCTTAGCTTCGCCGAGTCCTTCCGGGCCTCATTCCCCGACGTCCCGCTGATG Mn10106 CCCACCCTGACAGCCTTTTGCTACAACTATCTCGGCTCAGCTGCCGCCACCGAGACAGTC Mb07020 ------TGCCAGCCTGTGCCG------TTGGGCCTCTTGTAC------* *** *

Mn12262 ----CCGTCAGC------GAGAACTCCCTCATCA---GCG-C--- Mm10095 CATGCCGGTAGCTCCCGTTCACTCTCGCCC---GAGGGCTCAGACACCAATCGCG-CGGA Mm99049 GCTTCCATCAAAACCGATCAGAG--CGGCC---TCAAACTCTTCTACCAGGACCA-GGA- Mn11037 GTGACTGGCGGCTTCCACAGCCGGGCGGGCATGGAGGAAGCCGTCTCCGGCGGCGGCACA Mn10106 ACGGCCA-CGGCTAC-GCAGTCA-TCGGCCATCACCAGCACCGTCACCGGCACCGATACA Mb07020 ---GCTGTCAAG------GTGAACGGCAGTGCT--- * *

Mn12262 ------CAATGTCG--AGTTTGGC------Mm10095 ATCTCCAC-----CAATCGATTTCG--ACTTTGCCCGATCGGCTGCCGTGACCCGGAGAC Mm99049 ------CGGGAACCCTCG--GGTCGCGTGGAACGGCGATGCCGGCTGG----- Mn11037 GACCTCAT-----CGGCCTAGGCCGC-GCCTCTGTCGTCAACCCCTCGCTGCCGGCTGAT Mn10106 ACCACCGCTGGCGTGACCGAGACTGTTACCGAGACGGTCACGATCACGCCCACCGCAGAC Mb07020 ------CAAAAGGG------*

Mn12262 ------AAGGGCGACC------TCAAG Mm10095 CC------CTTGTCTCGTCCA------TTAAGACCGACCAGCAAGCAAACGTCTTCAAC Mm99049 ------CTTAGCCATCCGC------TTGGGCACGTCC------CGTTCGAGTGGCAG Mn11037 AT------CGTGCTGAATCGAG-----CCGTTGCCGACGCCGATGCGTTTGTGACGGTC Mn10106 CCAGCGACCATTACTGAGACGGAGACCACTGTCGTTGGCTCCGTCACGACTGTAGCTACC Mb07020 ------

Mn12262 AT--ATTCTAC-----CAGGA-----TACGCACG------GAA Mm10095 ATCAAGTCTATGGA--CGACG-----TGCGCACGCTACTGACCGAACTACAATCCCAGAA Mm99049 AT-GGCCCCACAGGACCGCGA-----CACGAGCGATGACGACAA------TGGAA Mn11037 ACTAAGTTCCGGCCGCCGTGG-----TACCTGAACTTTGGGCCACAGCTGGTCGGACTGG Mn10106 CCCTGCGACACGACGACGACAACAACCATCAGCACGTCCTACTACCGCGTCGTCGACTAT Mb07020 ------

Mn12262 ACC------CCT----GGGTTG--- Mm10095 GCCGGCCCGCTGC-----CCCACTGCCG----TCGAAGTCCTCCCCC----ATGCTGGTA Mm99049 CCCCCCCGATCAG-----ACTGGCTGCG----GCCAGGCTTGACTCT----GCCTTCATG Mn11037 GCTACGAGAATGTGAGTGTTGTCCGCTT----TCACCATATTGCATT----GCTTTGGCA Mn10106 GCCAGCCGCTCTTCGTCGGCCGCGGCCGATAAGCGCTCCGCAGCCCCTGAAGCCCCGGCC Mb07020 ------

Mn12262 ----CGTG-----GGTC------GTG-CTT---GGCCAAA------CG
Mm10095 ACGACGTGCAAGAGGCCTTTTCCGCATACACGAAGTA-CTTTGGAGCCAAGGTCAAGGCG
Mm99049 GCCACATG--GTCTGCCATCGATCCCAAGCCACGATA-CTAC--AACAAAA------AG
Mn11037 GTTCCCTCGTCTCTTTATTCTTCCTTTGGAGTTTGCA-CTTCTGAATGATGACATGGCCT
Mn10106 GTGCCCACCGAGCTCTCGCGCGGCTGCAAGGCCAACAGCCTGCCTGCCAAGCTATCGTCG
Mb07020 ------

Mn12262 GCT------TG-GGCCTCTC------ACCCGCT--- Mm10095 GCCATGAAGTTGACGTTGGGGCAGGTTCTGCTG-GCTTCCCC------GCTCGCTAAG Mm99049 GTCCCAAA------TCA-GTACTCCA------GCAAGCTGCG Mn11037 CACAGGCCAGTGACACCAGCGAAGATGACATGG-GAACCCTCCCGGTGAAACTCGCTGCC Mn10106 GCCTGCTCGTGCGTCCTCGCCACAGCGACATCACAAACTGTCACCGTCACGAGCACTGCG Mb07020 ------

Mn12262 -----CAAGC------CTATCCCT------Mm10095 CTCGTCAAGCATTCGGACGCCATGCGGAGAGATCTCGACTATTTCCGCAGCGAGGCCGAG Mm99049 -----CAAGCTCCCCGACACCCAGC------TCCACTACGCCCTCCTCGATGCCCAG Mn11037 ---GCCAGGCGCGATGCCGCCTT-CATGGCTATATGGTCCGCCATTGACCCGAAGCCGCG Mn10106 ACGGCCATCAACACCGATGTCGC-GACCTTCACCGAGACCGTCACCGTCACCACGACAGG Mb07020 ------

Mn12262 ------Mm10095 AGCGTACGCCAACAACTACCCGAGGCTATCAATGTGGTCAAGGAGGCTCACAATCGCGAG Mm99049 GAGTACTGGATAGAGATCCTCGCAGACCACTGTCCGCCCGCCTGGGTCGGCATCGCCAGT Mn11037 GTACTACAACAAAAAAGTCCCAAATCAGTAC-----TCCAGCAAGCTGCGCAAGCTCCCC Mn10106 CGCGGCCACCACTATCGCCACAAACACGGCA---CCCACGGCCAACCCGACGACGGTCAC Mb07020 ------

Mn12262 ------TTCCAG Mm10095 GTCAGGCGGCTGCACAAGGATATTGCCGATCTAAAGGTCGCCGTAAGTAAGACATCCCGA Mm99049 GTCCTCCAGCAGCTTGCCACCATCATCCCCATGTCTGCCGC------AGCCAG Mn11037 AGCACTCAGCTTCACTATGCCCTCCTCGATGCTCAGGAGTACTGGATCGAGA--TCCTTG Mn10106 CACAATCACCACCAGCACCTCCTTGTCCACCATCGAGACCTGCGTCTCGACT-ACCACCG Mb07020 ------CG

Mn12262 TGGCAG------Mm10095 CAGCAGGCCACCAAAAAGCCACAGCCCGAGGCCGATGCGTCGGCGAGCCGGT-ACCAAAT Mm99049 CACCAGGCCCTACATCGCCCTCAA------GGCCGCGCTCAAGA-AGGCCAT Mn11037 CGGACCACTCCCCTCCCGCTTGGGTCAGTATCGCTGGTGTCCTCCAACAGCTCGCCACCA Mn10106 TTGTAGTAACCTCGACCACCACGACCGACTTCACATGCGGCACTCCCTCGCCCACAGTCC Mb07020 CGGGGGAC------

Mn12262 ------ATGGATTCAAG------TCAA-----CTATCT Mm10095 CACCACGAGAGCTGCTGCTGCACGGGTTGGAGACCCAAG----GGTCAA-----CCGTCA Mm99049 CCATGCTAGCACACAAGCGGGCTGGAAGGACGATGCTGA----CATCCG-----CTGTCC Mn11037 TCATCCCCATGGCTCCAACGGCTAGCACCAAGCCCTACAATGCGATCAAGGCCGCCCTCA Mn10106 TCGG--CATTGGCTACGTCACCTGCCCTGACCCTGCGGATCCCGAGGAGACGGGTGTTCT Mb07020 ------ATGCCTCCCAA------AA-----TCACCG *

Mn12262 C------AGCC--CGCCACATTCAC
Mm10095 CGAGCGCAGCAAGCTAGACGATCCTCCGGCCATGAGCACGAGAGCC--CGACAAATGCAA
Mm99049 CG------CTGTGCCACTAGCGCTCTTATACGC--AGTCAAGGTCAA
Mn11037 AGAAGGCTATCCACGCCGGTACGCACGCGTTATGGAAGGATGATGC--CGACCTCCGCTG
Mn10106 CTCTCGCATTGAGGGCAACGACTCCGAGGGCACTATCTTCGAGGCCTGCGTCAACGTCCG
Mb07020 A------AGACCTTGCCGCGGCCAA
* * * *

Mn12262 --TAAGC------ACAAAGCT------CATCAACGCT-ACG---- Mm10095 GATAAGCCGGATGCCAGCAACGCTTCCCGCAAGAGGGGCCCGCCCCAGTACC-ACAGCAT Mm99049 TGGCAGTGCGCAG--AATGGAGCT------GGAGACATGCCCCCCAAGATC-ACTGAAG Mn11037 TCCAGCTGTGCCGCTGGCGCTCTTGTACGCGGTCAAGATCAACGGCAGTGCTCAGAATGG Mn10106 GGCCGGTGACCTCACGACCCCCTCGGGCGGCACCCACAAGTGCGACGGCACCAACAATGG Mb07020 GCTCGAC------TGGCTCATCCAGG------ATCCCAGCACT------*

Mn12262 ------CAAGTAA------Mm10095 ATCAGGTCCATCGCGGAAGCGGGTTGTAACAAGCAAGAAGGCGGCGA------Mm99049 A-CATCGCCGCCGCTCGCCTTGACTGGCTCACCCACGAGCCTAGCGA------Mn11037 AGCCGGAGACATGCCCCACAAGATCCTGGAGGACATCACCGCCGCCC------Mn10106 CGCCAACCCCGTCCCTGGGGGCACCAGCACGCGCGCTCTCGCAGCCTCGGCCGCCCTCGA Mb07020 ------TCAGTCATCCCAA------

Mn12262 ------T------CATACTTCCC-- Mm10095 ------ATCTGAAGACG-----GTATACGCCT------CATACTTTGCGG Mm99049 ------GTCCGTCATTC-----CGAAATGC-T------GGCGGTCTTGGG Mn11037 ------GCCTCGACTGGCTTATGCAGGAGCCTAG------CACGTCTGACGT Mn10106 CGGCTTCGACTTTGACGGTACCTGGGACAACTCTTTCGATGACTTCTTCATCACCCGCAT Mb07020 ------GATGCTGGAGG-

Mn12262 ------CTGTCGCC------TTCTC Mm10095 ATGACGACATGGATTCCACTAAAGATGTGGCTCTTGCCAGCCCAACCTTGA--TGTCATT Mm99049 CCATTCGCCACAAATATCCTCACGCGGCGTACATTGTCCAACAGCTGCCGC--CGGCCGA Mn11037 TCCAAAATGCTGGTGGACTTGGGCCATTCGCCATAACTACCCTCACGCCGA--GTACATT Mn10106 TTCCAGCACTGCCCAAACAAGCAGCCAGTTCTGGGGCCTCCTCATCAACGACCAGTTCTC Mb07020 ------ACTTGGGCCATTCGTCACAACTACCCGCACGCAGC--GTACATC

Mn12262 CGCTGTCGAAGACAACAACT---ACAGCTATCT---CTACT------ATG- Mm10095 CCCTCACAAAGACTTTGACTTTGGCGGCACTCT---CGATTCTCGCACGCTGTTCCATG- Mm99049 ACCTATCCAAGCCCGCCGGTCTGTCAACAGGTC---CTCGGCACGAT--CTTCTTCATG- Mn11037 GCCCAGCAGCAGCCTGCAGTCCAACCCGGCCCA---AGTCGCCGTTCTATCAACAGACC- Mn10106 GCCCGTCGGCGGCTGCCAGTCCCAGCTCGCCCAGGGCGACCGCAACCTGTGGGCGTACGA Mb07020 AGCCACCAGCAGGCACCCGCGCAACCTGCACAG---AACCGCATATCCGTTCACAGGAC- * *

Mn12262 ----CTCG-CAAGAACG------GCTCAATTGCTG-TTCTGA------Mm10095 ----CAGA-CAAGATCGTCAGCA-CCAAGGCAAGCTTTGAAGCTG-TTCTGGCATCCACG Mm99049 ----GCAT-GAAGACCG------GATGAACATCAAAGGCA-TCCCGC------Mn11037 ----CTCGGCAAGGTCGCCATCA-CAACAGGAAGACAGGATGAAC-ATCAAGGGCATCCC Mn10106 CTCCTTCAACAAGGCCAACTTCCTCCAGGTCACGCCGGGCTACTCCATCGTGCGCGCCGG Mb07020 ----GTCGGCCAGATCACCTTCG-CAATTCGAGGATCGCATGAAC-ATCAAAGGCATACC ** * *

Mn12262 ------AGAGTGACACGAC-----GCAAGAG------
Mm10095 CTCG---CAGTCACAGAATGGAAGGATA---TGCAAAAGCTAAGGAGTGGTACATCTCGC
Mm99049 ---G---GCGATTCAGTGTGCCACGGCC---AACGAGCCCAGAGCTTGGACACAG-----
Mn11037 CCGG---CGGTT-CAGCGTGCCACGGCC---GACGAGCCCAGAGCTTGGTCACACCAAA-
Mn10106 CTCGGCCCCGTCCCAGCAGGTCCAGGTCATTGACGCGCCCACCGGCCGCCCCGTCTCCGG
Mb07020 GCGG---CGGTT-CAGCGTGCCTCGACC---CACAAGCCCAGAGCTTGAGTACGGGCGA-
** * * *

Mn12262 ------GGCAACA------ACACGAAA------TACACACCCA- Mm10095 CGCTTGCCCATTAATCAGCACCACTTGCGTGACCGGCACGGTACTGTCCTGTACAGCCAG Mm99049 ---TCGATATGCGAGAGGCAACCACAA--TGACGACCACGATGGAA---TGCTGATCGAG Mn11037 ---TATTCGAGGGACGAGCACTACCCC---GACCATGACGGAA------TGCTAATCGAG Mn10106 CGCCTCCATCGCGGGCACAATCACCGACGCCAACGGCATGGCCGAGATCAGCGTGCCCGC Mb07020 ---CACGAGAGAAACGAGCACTACGGT---GACCATGACGGAA------TGCTCATCGAA * * * *

Mn12262 -----CGAGCGTC------ATCG---TGTCGGGAAACACC----- Mm10095 CTGTTCGAATGTTTCGGACTCCGGGAT-CTAAGTATGA---TGCCGAGGATCGCCAGCCT Mm99049 GACGACGAGTCTTTCGAACTTTGGGACGCCTCCTTCCG---TGATGAGGAGCTCCGCCA- Mn11037 GACGACGAGTCTTTCGAGCTATGGGACCCCTCTTTCCG---CGACGAGGAGCTCCGTCGA Mn10106 CCGCCCGGGCTGCTACCAGTACAAGGCCACGCGGGCGGACTCGCTACGCAGCAACGCGTT Mb07020 GACGACGAGTCCTTTGAGCTCTGGGACGCGTCCTTTCG---TGACGAAGAGTTGCAAGAA ** * * *

Mn12262 ------G Mm10095 CCTTGCCAATATCCA----TCCAATGCAGACTCCCAATCAGAACTATTTGTACCGGAAGG Mm99049 ------AGGACCTA----ATGAGCACAAGCTACAGTTCGAGCTCGAAGACTCC-----T Mn11037 GCACCCAGAGAGCACAAGCCACAAATCGAACTCGACGACTCTCCAGCGAGAGATTTCAGG Mn10106 CTACCTGACCAATTTTAATTGTTCTTCATACACGCGGGGCTGCTTCTTGTCGTCGGTGAG Mb07020 GCATCTCCCGAGCCCAAACCAGACCTCGAATTTGACAATCCCCTAACCCGGAGCTTGAGG

Mn12262 TT------AGCTCGAGTTCCACAAACATCAGCGCT------Mm10095 TT------TGCACAAACCTCAAGACCATCAAGGCCCATGGTT-CCGCCTTCCATTGGTCT Mm99049 CT------TCAAGAGGTTTCAGACATGCCGGTAGCTCCCGTT-CACTCTCGCCCGAGGGC Mn11037 CG------TGCCGTTAGCTTCCGCTCATCCTCGCCTGAGGGC-TCTGACACCAATCGTGC Mn10106 CGACGCGGTGCGCTGGTGCAGCTACGATCTCGGCAAGGAGGCGCCCGTCACGCAGTGTCT Mb07020 CA------TGCCCAGAGCCTCCGCTCGCCCTCGCCTGCAGAG-TCCGACAACTATCGTGC *

Mn12262 ------GTGG------Mm10095 TCCAATGCCAACCTACAGGCTGATGAACT---TCAGGTGGAGGCTTT------ATG Mm99049 TCAGACACCAA------TCGCGCGGAATCTCC------ACC Mn11037 AGAAGCTCCACCCATCGACTTTGACCTTG---CCCGATCGGCTGTTGTGTCCCGGAGACC Mn10106 GTTCACGGCGACGAGCACGTCTGCGCCCGTGACCAGCATAGTCATCACGACCGCCACCCG Mb07020 GATCACGCCAGCCGTTGACTTTGGCCTCG---TGCAGTCGGCAGCAGCCCACCGCAGAGC

Mn12262 ------CTTATAAGGACAG--CAGCGG--CAAC------CGCCAGGTCCG Mm10095 GGCGCT-CGCCCTCATCGGAACTG--CAACGGA-CGGCTTCGGTGTCCTCGCCAGGGCCC Mm99049 AA------TCGATTTCGACTT--TGCCCGATCGGC------TGCCGTGACCC Mn11037 TATTGT-CTCGTCCATTAAGACTGACCAGCAAGCAAATGTCT---TCAACATCAAGTCCA Mn10106 GATCGTATGCATTCTTGAGGCCAGGGAGGCACGCCGGCGCAGCGTTGATGATGACCTGCC Mb07020 CCCTGT-TGCGTCCACTAAGACTGACCAACAAGCGAATGTCT---TCGACATCAAGTCTA * *

Mn12262 TG------TCTACTA-TATCGGCACCGCTCCAA------
Mm10095 TGATAGACTTGCTCAACGA-GACTGGCCTCACTTGGATCATGCTTCCTTCTGAGATCCAG
Mm99049 GGAGACCCCTTGTCTCGTC-CATTAAGACCGACCAGCAAGCAAACGTCTTCAACATCAAG
Mn11037 TGGACGACGTGCGC-ACAC-TATTGACCGATCTACAGTCTCAAGTACCAGTCCGCTGCCC
Mn10106 TG--CAGCGACGGCGACTCCTCTTGCCGCAAACGCAGTCGGGGTGGGGAGCTTCCTCTTG
Mb07020 TGGACGACGTGCGC-ACGC-TACTCACCGACCTCCGCTCTCAGAAGCCCCTCCGTTGTCC
* *

Mn12262 ------CTGGAGACTT---CCAGCT--CAGGG------Mm10095 AACATTCCT-GATTGCTGGAAATTA---TGGGCTATCAGGGTAGGCTATCTGAGTGCTTC Mm99049 T------CTATGGACGA---CGTGCGCACGCTACTGACCGAACTACAATCCC Mn11037 CAC------GGCCGTCGAGGTCCT---CCCTCATGCAGGCAACGACGTG------C Mn10106 CTCCTCCTAAGGTCACCTAGACCATTGCCCACCGTGCTTGCGCCGCCGGAAAGCAACGTC Mb07020 AACT------GCAGCCGAAGTTCT---ACCGGACGCCGACAGTGACGTG------C *

Mn12262 ----AGCTGGTC------CAGACC------AACGGCGGCG-- Mm10095 ACCGGACCAGTCTGC--CACGGACCCCAACAACCAGCAGTCTCCCGAGAATGGCGGTGCA Mm99049 AG-AAGCCGGCC-----CGCTGCCCC------ACTGCCGTCG-- Mn11037 ACGAGGCCTTTTCCG--CATACACAAGATACTTCAGCGCCAAGGTCAAGGCGGCCATGAA Mn10106 GGCGCGCATGTCAGCAATACGGACCTGAAGCTCTTCGTGCTGAAGATGTCTGGAGACGAG Mb07020 ACGAGGCCTTCGCAA--CGTACACGAGATATCTTGGCGCCAAGGTCAAGGCCGCCATGAA * * *

Mn12262 ------Mm10095 GCTCAGGCCGTCATTGCTCAGGCCTCTTCCACCACTGATGAGGACGCCGATATCATCTTC Mm99049 ------AAGTCCTCCCCCATGCTGGTAACGACGTGCAAGAGGCCTTT Mn11037 GTTGACATTGGGTCAAGTTCTGCTAGCGTCCCCGCTCGCTAAGCTCGTCAAGCAGTCTGA Mn10106 GCTC-CACCCGCGCGGTACTCGCACTTCTTGGCATTGCCGAGGCTGCTCCAGTCTGGTGG Mb07020 GTTGACATTGGGACAGATCCTGCTGGCATCCCCACTCGCTAAGCTCGTCAAGCAGTCGGA

Mn12262 ------ATTTCACC--CAGGG-- Mm10095 CAGGGCTCGAGACAGATCTCTCACCAAGATCATGACCAGCAGGATTTCACCACCAAGGAC Mm99049 TCCGCAT------ACACGAAGTACTTTGGAGCCAAGGTC Mn11037 CGTTATGCGGCGAGATCTCGACTATTACCGCAACGAGGCCGAGAGTGTACGCCAACAACT Mn10106 CTGGACTCATCA--ACTACGATGGCGCCGCCCAGGTCGCGATGGCGGCAATCTGGCCGAC Mb07020 CGCCATGAGGAGAGATCTCGACTATTACCGCAACGAGGCCGAGAGCGTACGCCAGCAACT

Mn12262 ------CCAGCTCG------Mm10095 GAGCATAGCT-CCAGCTTGGAGTCGGTGCCTGTACTCTCCCACGCGCCGCGCAGTCCCGG Mm99049 AAGGCGGCCA-TGAAGTTGACGTTGGGGCAGGTTCTGCTGGCTTCCCCGCTCGCTAAGCT Mn11037 ACCCGAGGCCATCAATGTGGTCAAGGAGGCTCAAAATCGCCAGGTCATGCGGCTGCATAA Mn10106 AGCGAGCATCGGCAGTTCGGTGAGGGCTGGCCAGGACATGCCAGCACTGACAGGTCCGAC Mb07020 GCCGGAAGCCATCCAGACAGTCAAGGAAGTCCACCAACGGGAGGTCACGCGGCTGCACAA

Mn12262 ------ACGAAAACTCTCTGC--CCG------Mm10095 TCTGGCAGCTGCGGCATACTTGGC-----GGGAGACGAAATCTTTAAGC--TCGAAGACG Mm99049 CGTCAAGCATTCGGACGCCATGCG-----GAGAGATCTCGACTATTTCC--GCAGCGAGG Mn11037 GGAAATTGCCGACCTAAGGGCTGC-----TGTGAGCAAGACGTCCCGGCAGCAGGCCAAT Mn10106 TACAGTTGCTCCATTTACAGTAGCATACCGAGAGACGAGTCGCGCAGACTACCAGACCTC Mb07020 GGACATTGCCAATCTAAAGTCCGC-----CGTGAGTAAGACAGCCCGGCAGCAGGCCAGC *

Mn12262 ------TCA--GCGAGA---ACTC------
Mm10095 ACGATGCCATGAACTACTACCA--GCTAGATTCACTCGACGACGTGCGCAGGTTGCACTC
Mm99049 CCGAGAGCGTACGCCAACAACT--ACCCGAGGCTATCAATGTGGTCAAGGAGGCTCACAA
Mn11037 AGAAGGGCTCAGCCTGAGGCCGAGGCGTCGACGAGTCAATACCAGATCACCACGAGAGCT
Mn10106 ACAATGTTCTACGAGCCCGGCAAGACATCCCACAATCTGCCCCACGACCCCTTCAAGGCC
Mb07020 AGGAAGCCCCAGCTCGAGCCTGAAGCGTCGACGAGCCAGTACCATATCACGACCCGGGCC
* *

Mn12262 ------CCTCATCAGCGCCAA------Mm10095 GTCAATGTTCACTCCTAAAATCAATATGCGCGGCTCCATCCGTGCTGACATGACCAAGGA Mm99049 TCGCGAGGTCAGGCGGCTGCACAAGGATATTGCCGATCTAAAGGTCGCCGTAAGTAAGAC Mn11037 GCCGCTGCCCGAGTTGGCGACCCGCAAGTCGACCATCGTGAGCGC-AGCAGAATGGACGC Mn10106 TGCGTCGTGC-CCCGTCCGATCGGATGGATCTCGACCACCAGCGCTCTCAAGCCTGGCGA Mb07020 GCTGCCACGCGTACTGGTGGCTCAGTCGCCGGTCGCCGCGA-TGGTGGCAGGCTCGACGC *

Mn12262 ------Mm10095 GGATTTGATTAAGGCCTTTGAA------GGGTTTCAGAGCTATGTGGAC---CAGAGAAT Mm99049 ATCCCGACAGCAGGCCACCAAA------AAGCCACAGCCCGAGGCCGAT---GCG----- Mn11037 TCCTCCAGCCATGAGTACGAGA------GCGCGCCAAATGCAGATCAAA---CCGAATAC Mn10106 TCCGTCGTCGGCGGCTCAGCACAACCTCGCGCCCTACTCGCAGTTCAACAACCTGACCTT Mb07020 TCCTCCAGCTATGAGCACGAGA------

Mn12262 ------TGTCGAGTTT------Mm10095 CACTCTGGGCATCC---AGGCCGCTGTTGTCCAGTCTCTGA------TGGCCC Mm99049 ------TC---GGCGAGCCGGTACCAAATCACCAC------GAG--- Mn11037 TAGCAACGATTCCCGCAAGAGAGGCTCGACCCAGCATCACG------GGCCAT Mn10106 TGACCCCCCCTACGTCATGTTCAGCGCAAACCAGACCCCCGACCACCAGCGCAAGGACAG Mb07020 ------TCTCGCCAGGTGCATG------* *

Mn12262 -GG---CAA------GGGCGACCT-CAA-G------A------Mm10095 CGG---CAATGA------ATGAGCGAATC-CGG-G------AGGTCATCTACGAC Mm99049 -AG---CTGCTG------CTGCACGGGTT-GGA-G------ACCCAAGGGTCGAC Mn11037 CAGGTCCACCAC------GCAAGCGGGTTGCGACG------AGCAAGAAATCGAC Mn10106 CGTCCGCAACGCCGAGCAGACGGGCAAGTTCTGCTGGAACCTCGCCACCTGGGCCCTGCG Mb07020 -GTGATCCGCGT------GTCGGACAAATT---TCG------* * *

Mn12262 ------TATTCTAC Mm10095 ATGCAGGATCGTGGTCAGAACCCGGTCCCTTCCGCCCGCCGGCGCGCCCGATCACCCGAC Mm99049 CGTCACGAGCGCAGCAAGC------TAGACGATCCTCCGGC Mn11037 CAACCTGAAGACAGTGTACGCTTCGTACTTTGCGGATGACGATATGGTTATTTCTGGCAT Mn10106 CCACAAGGTCAACATCACCGCCGAGCAGGTCCCCTATGGCGTCGACGAGTTTGACCGCGC Mb07020 ------CGC

Mn12262 CA-GGATAC------GCA---CGGAAAC Mm10095 CA-GAATCCCGCATCTCCCAGACAGAGACAGCAGCAATCTGCCGACGACA---CGCTAGC Mm99049 CA-TGA------GCA---CGAGAGC Mn11037 CTCAGAGGTGCATGAGTACCCAGACACACGCGGGAACGCCGGTATCCGCA---TGGAACG Mn10106 CGGGCTGGCCAAGTGCTTCTCGACGACACTGCCGGGCGACGGCGACGGCAATCCGGTGCC Mb07020 AAGAGACAA------TCAATTCAGCATC **

Mn12262 C------CCTGGGTTGCGTGGG------TC------
Mm10095 CAGGGCCGCCGAGGTTGCCCAGG------TCCACGTCGAGCACAC
Mm99049 CCGACAAATGCAAGATAAGCCGG------ATGCCAGCAACGCTTC
Mn11037 AATTATCATCATCATCATCATCA------TCATCATCATCATCAT
Mn10106 CATGGTGGCCGA-GTCGCCCGTGCGCTTCGAGTGCGTCTACCACTCGACGCTGCGGCTGC
Mb07020 AAGAGGTATCGGCGCCCCCTCGC------

Mn12262 ------GTGCTTGGCC-A----AACGGCT---T------GGG Mm10095 CCAGAAGATCGAGGACCTGGAGCGCAGCC-ACGAGAGCAACC---TTCGGCAGCTGCGGG Mm99049 CCGCAAGA------GGGGCCCGCCCCAGTACCACAGCA---TAT------CAGG Mn11037 CATCATCATC----ATCATCATCATCATCATCATCATCATCA---TCATCGTCACAAGCA Mn10106 CTGGGAACCCGCCCATGGGGTCTGTCGACATTGTCATCGGCCGCGTCGTGGGCGTGCACA Mb07020 ------AAGCGAGTCGCG------ACTGGCAA------GAAGT *

Mn12262 CCTCTCAC------C Mm10095 ACGCCCACGAGG--CCAGGATGCAGCAGGTCAGG-----GACGCCCAGGAGGCCGCGACC Mm99049 TCCATCGCGGAA--GCGGGTTGTAACAAGCAAG------AAGGCGGCGAATCTGAAGAC Mn11037 CAGTCAACTACA--CAAATATACGACAA-CAACA-----GCCACTTCTTCCCCATCACAT Mn10106 TCGCCGACGAGGTGCTGACAGACGGGAAGCTCGACATTCGCAAGACGGAACCCATTGCGC Mb07020 CGGCCGATCTGA------AGAC

Mn12262 CGCTCAAGCCT------ATCCCTTTCC------Mm10095 CGCGCAGCCCTCGCGCGTATCGAGCGCAGGTCCTACTCCCGCAAGCGCACATTCGATCCA Mm99049 GGTATACGCCTC------ATACTTTGCG------Mn11037 CTTTCCACTCTCAT---CGCATCTTCTGAAGCTCAAGCCTCAGCGTAAATTATCCTTGTA Mn10106 GCTGCGGGTACTACGAGTACACAGTGGTGCGCGAGACTTTCGACATGGTGATTCCGGGCA Mb07020 GGTGTACGCC------ACGTACTTTGCG------

Mn12262 ------AGTGGCAG------ATGAAGACCA Mm10095 ------GAGGAGTTTGGACACGTGGTGAAGCGTCGGCGATCG----GACATGAAGACCA Mm99049 ------GATGACGAC------ATGAAGACCA Mn11037 ------CGCCTCTCCCCCAAAACAACCCAACAACACCG--CA----AAGATGAAGGCCA Mn10106 TGAACGAGGCCACGCTTGCGGGGCTCGAGGGCAGCACCAGGCGGCACAAGGCGTTGGCGG Mb07020 ------GATGACGA------AATGAAAGTCG * *

Mn12262 ACTTCGCTGCCATCC-TCCTCCTCGCCTCCGTTGCCGCCGCGACCTCCTCTACGACGGGC Mm10095 ACTTCGCCGCCGTCC-TCCTCCTCGCCTCCGTTGCCGCCGCGACCTCCTCTACGACCGGC Mm99049 ACTTCGCCGCCGTCC-TCCTCCTCGCCTCCGTTGCCGCCGCGACCTCCTCTACGACCGGC Mn11037 ACTTCGCCGCTGTTC-TCCTCCTCGCCTCCGCCGCCGCCGCGACTTCCTCTACAAGCGGC Mn10106 AAGGGGAGGACGCAGATGGAGGAGGCGAGTGTCGCCAGCACGAGCACCCCGCCGGACGTC Mb07020 ACATCGCCACCATCT-TGTTTCTCGCCGCCATAGCTGTAGTGACCTCCCCCTCGAATGTT * * * ** ** ** ** * * *

Mn12262 CAG--GTCGAG----GAGCGCCATGTCCCTTACAAGAAGATGCCCGCGATCCG--CGGAG Mm10095 CAG--GTTGAG----GAGCGCCATGTCCCTTACAAGAAGATGCCCGCGATCCG--CGGAG Mm99049 CAG--GTTGAG----GAGCGCCATGTCCCTTACAAGAAGATGCCCGCGATCCG--CGGAG Mn11037 CAG--GTTGAG----GAGCGCCACGTCCCTTACAAGAAGATGCCCGCGATCCG--CGGCG Mn10106 CGTTTGTCGTGCCTCGGGCGCCACTCCCCCCGCATCGTCGCGACTGCCATTTGAACTGCC Mb07020 CAA--GTCGAG----GAGCGTCAGACCACCTACAAGAGTATACCCGCGATCCA--CGCCG * ** * * * *** ** * * ** * ** ** *

Mn12262 CTACCCCTGAGACCGGCATCTTCCACG------AGAAACGCGCCATCT------GGT Mm10095 CCACTCCTGAGACCGGCATCATCCATG------AGAAACGAGCAATCT------GGT Mm99049 CCACTCCTGAGACCGGCATCATCCATG------AGAAACGAGCAATCT------GGT Mn11037 CCACTCCTGAGACCGGCATCATCCACG------AGAAACGCGCCGTCT------GGA Mn10106 CTGCCCTTCCGGCTCTTTCCTTGCAGGCCTCTGGCATGAACCCGCCATTCCAGACTCGGC Mb07020 CCACTCTCGAGATTGACATCGCTCACG------AGAAACGTGGCACAT------GGC * * * * * ** * * *** * **

Mn12262 TCA-----TCGCCGAGGCCG---CCCGCCAATTCTTCACCTTT-GTTATAGCT------Mm10095 TCA-----TCGCTGAGGCCG---CCCGCCAATTCTTCACCTTC-GTCATGGCT------Mm99049 TCA-----TCGCTGAGGCCG---CCCGCCAATTCTTCACCTTC-GTCATGGCT------Mn11037 TTA-----TCGCCGAGGCCG---CCCGCCAATTCTTCACCTTC-GTCCTGGCC------Mn10106 TCAGCATTCCACCAAGGTTGACACCCACGGCTCCCACGCCCCCCATCATGGCTTCGCAAG Mb07020 TCC-----TTGCCGGGGTTG---GCGTACACTTTATCAATTTC-CTCATGGCT------* * ** * * * * * * **

Mn12262 ------GGCCTCGAGTTCA-TGGAGTCGCCCCCGCCTGTCTGGGAGC--CCGAGA Mm10095 ------GGTCTCGAGTTCA-TGGAGTCGCCTCCGCCAGTCTGGGAAC--CCGAGA Mm99049 ------GGTCTCGAGTTCA-TGGAGTCGCCTCCGCCAGTCTGGGAAC--CCGAGA Mn11037 ------GGCCTCGAGTTCA-TGGAGTCGCCCCCACCCGTCTGGGAGC--CCGAGA Mn10106 AACGCGACACAAGCGACGAGGACAATGAAACCCCCCCGATCGGTCTGGCTGCTGCCAGGC Mb07020 ------GGTCTC-ATTTCCCTCGAGGAGCCCCTCGCCATTTATGAGC--CCGAGA * * * * * * ** * * * * * ** *

Mn12262 GCAACA------ACTGCGTTATC-GAGGTCGAGGCCAACACGC---ACGCCGACG Mm10095 GCAACA------ACTGCGTCATC-GAGGTCGAGGCCAACACGC---ACGCCGACG Mm99049 GCAACA------ACTGCGTCATC-GAGGTCGAGGCCAACACGC---ACGCCGACG Mn11037 GCAACA------ACTGCGTCGTC-GAGGTCGAGGCCAACACGC---ACGCCGACG Mn10106 TCGACACTGCCTTCATGGCCACATGGTCTGCCATCGAGCCCAAGCCGCGACACTACAACA Mb07020 GCAACA------ACTGCGTCAAT------AAACAATCGGC---AGGCTT--- * *** * * * *** ** *

Mn12262 G------CCGCACTGATTGCC------GCATCGCAGGCTCCCCCAAGGGCGAGCCGGCC Mm10095 G------CCGCACTGATTGCC------GCATCGCGGGCTCCCCCAAGGGCGAGGCGGCC Mm99049 G------CCGCACTGATTGCC------GCATCGCGGGCTCCCCCAAGGGCGAGGCGGCC Mn11037 G------CCGCACCGACTGCC------GCATCGCGGGCTCACCCAAGGGCGAGGCGGCC Mn10106 AAAAGGTCCCAAACCAATACTCCAGCAAGCTCCGCAAGCTTCCCGACTCGCAGCTTCACT Mb07020 ------CTTCACCAGCCCCT------CAATGCCGGACTC-----GGACGAGGCAGCC * ** * ** * * *

Mn12262 TTTACCTC------GTACGGCAATGAGGAGTGGCAG------GACTGCTCCATCGGCG Mm10095 TTTACCTC------GTTCGGCAACGAGGATTGGCAG------GACTGCTCTATCGGCG Mm99049 TTTACCTC------GTTCGGCAACGAGGATTGGCAG------GACTGCTCTATCGGCG Mn11037 TTTACCTC------GTTCGGCAACGGCGAGTGGCAG------GACTGCTCCGTCGGCG Mn10106 ATGCCCTCCTCGATGCCCAGGAGTATTGGATAGAAATCCTCGCAGACCACTCCCCGCCCG Mb07020 TTCACGAC------CTTCGGCAAGGACCAGTGGCAG------GACTGCCCCATCGGCG * * * * * * * * * *** * * **

Mn12262 TCATCAACAGCTTCACC--TTTCCCGACGCGCCCGGCCGGCCCGGCCCA---GGCCGCAT
Mm10095 CCATCAACAGCTTCACC--TTTCCCGACGCGCCCGGCCGCCCTGGTCCC---GGCCGCAT
Mm99049 CCATCAACAGCTTCACC--TTTCCCGACGCGCCCGGCCGCCCTGGTCCC---GGCCGCAT
Mn11037 TCATCAACAGCTTCACC--TTTCCCGACGCGCCCGGCCGGCCCGGCCCG---GGGCGCGT
Mn10106 CCTGGGTCAGCACCGCCAGTGTCCTCCAGCAGCTCGCCACCATCGTCCCCATGTCTCCAG
Mb07020 TCATCAACAGCTCCACC--TTCCTCGACCCCCCAGGCCGTCCTGGCTCA---GGCCGTGT
* **** * ** * * * * *** * * * *

Mn12262 CGAAGTGCAGTGGACCCGGGACGGCGGCAAGAACGGGC----CGGGGTCCGGCGACGACG Mm10095 CGACGTGCAGTGGACCCGTGACGGTGGCAAGAACGGCC----AGTCGGGCGGCCACGACG Mm99049 CGACGTGCAGTGGACCCGTGACGGTGGCAAGAACGGCC----AGTCGGGCGGCCACGACG Mn11037 CGACGTACAGTGGACCCGCGACGGCGGCAAGAACGGCC----AGAACGGCGGCCACGACG Mn10106 CGGCCAGCACCAAGCCCTACATCGCGCTCAAGGCTGTCCTCAAAAAGGCCATCCACTCTA Mb07020 CGATGTCCAGTGGACGCGCGACGGCGGCAAGCTCAG------CGACG ** ** * * * * * * * * *

Mn12262 GCTTCTGGGCGCCCGAGTTCGGCTTCATGGAGCTCAACCCCCCGG-TGTG------GT Mm10095 GCTTCTGGGCGCCCGAGTTCGGCTTTATGGAGCTCAACCCTCCGG-TGTG------GT Mm99049 GCTTCTGGGCGCCCGAGTTCGGCTTTATGGAGCTCAACCCTCCGG-TGTG------GT Mn11037 GCTTCTGGGCGCCCGAGTTCGGCTTCATGGAGCTCAACCCCCCGG-TCTG------GT Mn10106 GCACACAAGCGGGCTGGAAGGACGATGCTGACATTCGCTGTCCAGCTGTGCCACTAGCGC Mb07020 GCTTCTGGGCGCCCGAGTTCGGCTGCATGAAGCTGAGCCCGCCGG-TCTG------GT ** *** * * * * * * * ** * * ** *

Mn12262 TCTCG---GCCGAGAAGACGCTCGAGAGCCAGGGCAATGTCATCTGCGACATG------A Mm10095 TCTCG---GCCGAGAAGACGCTCGAGAGCCAGGGCAACGTCATCTGCAACATG------A Mm99049 TCTCG---GCCGAGAAGACGCTCGAGAGCCAGGGCAACGTCATCTGCAACATG------A Mn11037 TCTCG---GCCGAGAAGTCCCTCGAGAGCCAGGGCAACGTCATCTGCGACATG------A Mn10106 TCTTGTATGCCGTCAAGGTCAACGGCAGTGCGCAGAATGGAGCTGGAGACATGCCCCCCA Mb07020 TCTCG---GCCGCGAAGACGCTTGAGGAGCGCGGCAACGAATTTTGTAACATG------G *** * **** *** * ** * * *****

Mn12262 AGATCCGCGACGGCA------GCCTGTCGGA------CAAGTGGTCTGGCTGGC Mm10095 AGATCCGCGACGGCA------GCCTGTCGGA------CAAGTGGTCCGGCTGGC Mm99049 AGATCCGCGACGGCA------GCCTGTCGGA------CAAGTGGTCCGGCTGGC Mn11037 AGATCCGCGACGGCA------GCCTGTCTGA------CAAGTGGTCCGGCTGGC Mn10106 AGATCACTGAGGACATTACCGCCGCCCGCCTCGACTGGCTCACCCAGGAGCCCAGCGAGT Mb07020 AGATCGATGTCAGCA------GCCTCTCGGA------CAAGTGGTCCGGCTGGC ***** * ** *** * * ** * * ** *

Mn12262 ACGT------GTACAAGTGCGGCGTCCCGTGCCGCGATGCTGCCGATGACT Mm10095 ACGT------GTACAAGTGCGGCGTCCCCTGCCGCGATGCCGCCGACGACT Mm99049 ACGT------GTACAAGTGCGGCGTCCCCTGCCGCGATGCCGCCGACGACT Mn11037 ACGT------GTACAAGTGCGGCGTGCCCTGCCGCGACGCCGCCGACGACT Mn10106 CCGTCATCCCGAAATGCTGGCGGACTTGGGCCATTCGCCACAACTATCCCCACGTGGCGT Mb07020 ACAT------TCACAAGTGCGGCGTGCCCTGCCGCGACGCTGCCGACGACT * * * ** * * * * * * * * ** * *

Mn12262 TCGCCGT--CCAGGGC-ATGCTCCCT------CCCGT------Mm10095 TTGCCGT--CCAGGGC-ATGGCTCCAGTGTGGTCTATGCCTGTGC-CTCGAGGGGCAGGC Mm99049 TTGCCGT--CCAGGGC-ATGGCTCCAGTGTGGTCTATGCCTGTGC-CTCGAGGGGCAGGC Mn11037 TTACCGT--CCAGGGC-ATGGCCCCCTTCTATCCCGCGCCCTCGA-C------A Mn10106 ACATTGT--CCAGCAG-CAGCCTCCCGCCGAATCTATCCAAGTCCGCCGCTCCGTCAACA Mb07020 TCACTGTGGTCAGAGCCATGGCTCCAGTATGGTCTATTCCCCAGCCCCA------GGGC ** *** * ** *

Mn12262 ------TACCCTGTTGGCA---GCAG--TGCTGTGT---CTAGTCGAGA------CAGC
Mm10095 GATGTCTGCTCTACCGGTGCATACGGCCTGCTGGGCGGATTAGCCGACGT----TCCTGT
Mm99049 GATGTCTGCTCTACCGGTGCATACGGCCTGCTGGGCGGATTAGCCGACGT----TCCTGT
Mn11037 GGCGTATGCAACACCGGGCTCTACCGCGTGCTGTCTGGCCTGGCCGAAGT----TCCTGC
Mn10106 GGCCCTCGATAAGATCTCCTTCACGGCACGAGGATCGGATGAACATAAAAGGCATCCCGC
Mb07020 GGCGTCTGCTCAACGGGTGTATACCGCCTCCTTGGAGGACTCGCCGAAGT----CCCCGT
* * * * *

Mn12262 AGC--CTGCGC--ACCGCCAC------CACCTCCTATTAGTG-GTTAC-TGC----- Mm10095 AGC--CCAAGCCTACTGCAAC------ACGCACTATCCCGTCGTG-CCGACATGC----- Mm99049 AGC--CCAAGCCTACTGCAAC------ACGCACTATCCCGTCGTG-CCGACATGC----- Mn11037 CGC--CCAAGCCTTCTGCGCT------GCCCACTACCCCGTTGCT-CCGACTTGC----- Mn10106 GGCGATTCAGCGTGCCACGGCCGATGAGCCCAGAGCTTGGACACGGTCGACATTCGAGGG Mb07020 TGC--TCAGGCCTTCTGCTCT------GCCCACTACCCCGTCGCC-CCAACATGC----- ** ** * * ** ** * *

Mn12262 TCGACCGGCATC------TA------CAAGCACCTGCTGC------CCCT Mm10095 ACCACCACCGTCATCGGTCCGGCA------AAGACATATGCCCC------CGCC Mm99049 ACCACCACCGTCATCGGTCCGGCA------AAGACATATGCCCC------CGCC Mn11037 ACCCGTACCGTCATCGGCTCGGCC------CGGACATCGGTGCC------CGCC Mn10106 ACGAGCACTATGATGACCCAGACGGAATGTTGATTGAGGACGACGAGTCTTTTGAGCTTT Mb07020 ACTAGCACTGTCGTTGGTCCGGCA------AAGACCTCTGCTGC------TGCT * * *

Mn12262 G-AGCGATTTCCCT------CCTGCTCAAGCGTTCTGCTCA---TCG-AAATTTCCCA Mm10095 GCGGCGACCTCGCCGGGGAGGCCCAAGTCGGGCGGCCTGTCCAATGCCG-AAATCACCTA Mm99049 GCGGCGACCTCGCCGGGGAGGCCCAAGTCGGGCGGCCTGTCCAATGCCG-AAATCACCTA Mn11037 GCCGCTGCTGCTCC------TA----CA-AAACCGCCCA Mn10106 GGGACGCCTCCTTCCGCGATGAGGAGCTCCGCCAAGGTCCCAGAGAGCACAAGCCACAAG Mb07020 GTGGTACCCATA------GGAAAGCCCAAG * ** *

Mn12262 TCCCTCCCAAGACCGTTAC------CGAG------Mm10095 TCTCATGCAGAAGCTGCACGACCTGGAGAGCGAGGATCTG----AGCACAGCGTGCTCAT Mm99049 TCTCATGCAGAAGCTGCACGACCTGGAGAGCGAGGATCTG----AGCACAGCGTGCTCAT Mn11037 CCTCT------Mn10106 TCGAGCTCGAAGACTCCTCCACAAGAGATTTCAGACGTGCCGGTAGCTTCCGCTCGCCCT Mb07020 C------

Mn12262 --ACCAAGACCACGACC------GTGGCAAAGCCCAAGA----GATCGGCCTTTCCT Mm10095 GCATCGAGACCCCGACCTGCGTCACGGTGATTGTCCCCAAGA-CGAAGTCGGGCTGCCGC Mm99049 GCATCGAGACCCCGACCTGCGTCACGGTGATTGTCCCCAAGA-CGAAGTCGGGCTGCCGC Mn11037 -----AAGGCCG------GCGGCTTGTCCAACAG-CGATATCG-CCTATCTC Mn10106 CGCCCGAGGGCTCAGACACCAATCGCGCGGAATCTCCACCAATCGACTTTGACCTTGCCC Mb07020 ------AGG------GCGGGCTCTCCAATGC-CGACATCA-CGTATCTC ** * * ** * * *

Mn12262 GAACA--TG--GGGTCCGCCGGGGTGACCCA--CGGAGCCCA---TTGATGGATGAGCTA Mm10095 GGTCCGTCG--GCGACCGTGACGGCGTCTCTGCCGGCAACCACCCTCGTTACTTACGCCA Mm99049 GGTCCGTCG--GCGACCGTGACGGCGTCTCTGCCGGCAACCACCCTCGTTACTTACGCCA Mn11037 ATCCA------GCGTTTGC------ACCTGCTCGAGAACAA------Mn10106 GATCGACTGCAGCGACCCGGAGACCCCTTGTCTCGTCCATTAAGACTGACCAGCAAGCAA Mb07020 ATGCA------GAGACTGC------ACCTTCTGGAGAACAA------* * * * *

Mn12262 CAA--GGCCTTAATAAGAACATGGCTAGCA-GTGCGTGC-GCATGTATCCAGACGACGCC
Mm10095 CCAC-GACATTGCCGAGTCCGACGTCGACG-TCGTCCGCTGCATACTCACTGGTGACGCC
Mm99049 CCAC-GACATTGCCGAGTCCGACGTCGACG-TCGTCCGCTGCATACTCACTGGTGACGCC
Mn11037 ------GTTCCTGAGTAC------CGCCTGC-TCCTGTATC---GAGACTCC
Mn10106 ACGTCTTCAACATCAAGTCGATGGATGACGTGCGTTCGCTGCTTACTGACCTACAATCCC
Mb07020 ------GGTCCTCAGCGC------TGCTTGC-TCCTGCGTCC---AGACACC
** * ** * * * **

Mn12262 CAC--CGTCACGGTCAGTTTGCCTTCACACAGGACCACCA------CAATCAC---- Mm10095 TACGTCGACTTCCTCGGTTGGCTCGACTACGTCTTCGTCAGCGGGTC--CGACTAC---- Mm99049 TACGTCGACTTCCTCGGTTGGCTCGACTACGTCTTCGTCAGCGGGTC--CGACTAC---- Mn11037 AAC------CTGTTTG------Mn10106 AGGAGCCGGTCCGCTGCCCCAACGCCGTCGAAGTCCTTCCTCATGCCAGTAACGACGTGC Mb07020 GAC------CTGTGTC------

Mn12262 -AAAAACATCTACAAAACTCACCGACCACCCTACCGGGCCGACGTCGACATGCCCTAGCC Mm10095 -ATCTCCGTCGGCGACGATCACGAGTACCGCGAGCAGCTCTACCAGCACCACCTCGACTT Mm99049 -ATCTCCGTCGGCGACGATCACGAGTACCGCGAGCAGCTCTACCAGCACCACCTCGACTT Mn11037 ------ACAGTGACAGTACCTGCCACCACCTCGACCATCGCAATGGCGTCCT Mn10106 AAGAGGCCTTTTCAGCATACACAAAGTACTTTGGCGCCAAAGTCAAGGCGGCCATGAAGT Mb07020 ------ACGGTGACAATCCCCAGCACGACCTCGTCTAGCACGAGGGC---TC ** *

Mn12262 CAAGCACGGC---AACCTTGACGGTGACGTCGG-TATCAAGCGTACC----AACC----- Mm10095 CGAGCTCGTC---CACTTCGACAACCACGGCGGGCACAGAGCCGACCCCGCAGCC----- Mm99049 CGAGCTCGTC---CACTTCGACAACCACGGCGGGCACAGAGCCGACCCCGCAGCC----- Mn11037 CGACCTCGGC---GGC---GATGGCTAGCTCGACTTCATCGTCAAC-----GACC----- Mn10106 TGACGTTGGGTCAGGTTCTGCTGGCGTCCCCGCTCGCTAAGCTCGTGAAGCAGTCAGACG Mb07020 CATCCTCAACCACAAGCTCGACCACAAGCGCGATCTCGTCGGCCACC----ACTC----- * ** * *

Mn12262 ------GTAATCACCACGACGACCAC------GAC---AACCCGTGATCCCATGAC--C Mm10095 ------GCCGACTCCTCAGGAGCCAACTCCTCAGAT---CAGTCCAGTGCCGACTCC--C Mm99049 ------GCCGACTCCTCAGGAGCCAACTCCTCAGAT---CAGTCCAGTGCCGACTCC--C Mn11037 ------GCAGTGGCCCCAACTGTCAC------TTCGACGTCGTCAAT--C Mn10106 CCATGCGGAGAGATCTCGACTACTACCGCAACGAGGCCGAGAGTGTACGCCAACAACTAC Mb07020 ------TTATCACATCCACTAC------TGCGGCTTCCTCGAC--- * * * *

Mn12262 CTGACTCAAACCACTACAGCTACAGAGATCAGTATCACCACCAC-CACTACGACTTC--- Mm10095 CAGATCTCGCCTGTTCCTACGCCGCAGATCAGCCCGGTGCCCAC-GCCGCAGATTTCCC- Mm99049 CAGATCTCGCCTGTTCCTACGCCGCAGATCAGCCCGGTGCCCAC-GCCGCAGATTTCCC- Mn11037 ---AGTTCATCAACTTCCCCGTCGATGATCACATCAACGTCGTC-TTCAACCAGTTC--- Mn10106 CCAAAGCCATCAATGTGGTCAAGGAGGCTCACAATCGCGAGGTCACGCGACTACACAAGG Mb07020 -----TTCGACATCGTCTTCGTTGGTAACTCCGACCTCGACTTCGTCCGTCGGCCCGAC- * * * *

Mn12262 ----TGTCACCACC------GTCACCCAGGACCC--TTCTCCATCTCCT Mm10095 -CTGTGCCGATTCCT------CAGATTAGCCCGGTCCCGACACCCCAGATCTC Mm99049 -CTGTGCCGATTCCT------CAGATTAGCCCGGTCCCGACACCCCAGATCTC Mn11037 ----CGCCCCTCC------ATCAACC------TCCTCCACCTCCT Mn10106 ACATTGCCGATCTGAAGGCTGTCGTGAGTAAAACATCCAGACAGCAGGCCAACAAGAGGT Mb07020 ----CGCTGCTTCATC------GACAACCAG------CACTAGCACCA * * ** * *

Mn12262 T----CCGGGGACTCTTGCCTGGATTTGCCC----TCAA-C-CTTCCGCCCCGACCAGCC
Mm10095 A----CCCGTTCCTACGCCACAGATCAGCCCGGTGCCAA-CTCCTCAGATCTCTCCGGCA
Mm99049 A----CCCGTTCCTACGCCACAGATCAGCCCGGTGCCAA-CTCCTCAGATCTCTCCGGCA
Mn11037 ------CTTCCACCACCA--TTGCCA----CCAC-CACCACCACATCAGCAGACC
Mn10106 CAGAGCCGGAAGCCAATGCGTCAGCGAGTCGGTACCAAATCACCACAAGAGCTGCTGCTG
Mb07020 G------CCCAACTACCA-CTAGCAG--CACTAC-CAGTACCAGTGCTGTGGCTC
* * * *

Mn12262 CTGTGGCTGC-GCCTA---CGCGGC------GTCCTGCTTCACCA----CCCCTCGTA Mm10095 CCCTCGCCGCAGCCCAGTCCTCAGCCATCTCCTGTGCTGAGTCCCCAGAGGTCCCCGATC Mm99049 CCCTCGCCGCAGCCCAGTCCTCAGCCATCTCCTGTGCTGAGTCCCCAGAGGTCCCCGATC Mn11037 CCGCGCCCACCGACGA------TGATATCCC------CCCCGTG Mn10106 CTCGGGTTGGAGACCCACGCACTGATCGTCACGAGCGCAGCAGGCTGGACGCTCCTCCCG Mb07020 CTGCTGTCAC-GTCTA------GTGCTGACGTCACA----TCCTCCACG * * * *

Mn12262 CCT---ATGCCAGCCGCTGGCTCGGCCGCGACATCCCC-AC--TGGCCAGGAGTC--CTA Mm10095 CCC---TCGTCACAGCCTAGCCCGGTGCCGACACCTCAGAT--CTCCCCCGTGCCAACTC Mm99049 CCC---TCGTCACAGCCTAGCCCGGTGCCGACACCTCAGAT--CTCCCCCGTGCCAACTC Mn11037 CCA---ACACCCGAGCCCGA---AGAGCCGACGCCTGAGAT--CACGCCCGTGCC----- Mn10106 CCATGAGCACGAGAGCCCGACAAATGCAAGACAATCCGGATGCTGGCAACGGTTCCCGCA Mb07020 GCA---GAGCCGGCCATCACAACATCTTCCACAAGC---AC--CACGAGCGAGGCAACCT * ** * * *

Mn12262 CGCAGCCTGCGCCGCCTCATGCGACCGCAACT--TCTACTGCGAGCTT---TTTACGTAT Mm10095 CCCAGATCTCACCTGTTCCTACACCCCAGATCAGTCCGGTGCCGACTC---CCCAAATAA Mm99049 CCCAGATCTCACCTGTTCCTACACCCCAGATCAGTCCGGTGCCGACTC---CCCAAATAA Mn11037 ------CACACCCATATACACGCCTG------CGCCGCCCACGC---CCGCCGTAA Mn10106 AGAGAGGCCCGCCCCAGCATCACGGCGTGTCAGGTCCATTGCGGAAGCGGGTTGCAACGA Mb07020 CGAACACCACGACCTCGAGCACTACGA------CCTCGAGCAGCTCGCCCACCACTG * * *

Mn12262 AGCAG---CCAGAC--GCGCGAATGCTTCCAGTACTCTGGCGTCACCGAGTGGGAC---G Mm10095 GCCCGGTGCCGACC--CCTCAGAT--CTCCCCTGTTCCCACGCCACAGATCTCCCCCGTG Mm99049 GCCCGGTGCCGACC--CCTCAGAT--CTCCCCTGTTCCCACGCCACAGATCTCCCCCGTG Mn11037 ACCCG------CGCGATG------AGGAAGCGGGCGTTGCCGAGCGCGAC---- Mn10106 GCAAGAAGGTGGCCAACCTGAAAACGGTGTACGCCTCTTACTTTGCGGATGACGAT---A Mb07020 ACGAG------TGGGAGC------CGACTCCCATCCCGCCGACCCCACAGGAG * * * **

Mn12262 TCAACTACACCAGCTATGGGGCCGTGTACCTTTCAGAGG---CCGATAGAAT-CTCTGCG Mm10095 CCAACCCCACAGATCAGCCCGGTACCGACTCCTCAGATCTCCCCGATGCCTT-CGCCGCA Mm99049 CCAACCCCACAGATCAGCCCGGTACCGACTCCTCAGATCTCCCCGATGCCTT-CGCCGCA Mn11037 ---ATTCC------GGGCCTAGAACTGCTGGAGGTTATGGTCGCATTATATTACG Mn10106 TGAAGACCAACTTCGCTGCCATCCTCCTCCTCGCCTCCGTTGCCGCCGCGACCTCCTCTA Mb07020 CCTACTCC------CCAGGTCACTCCCGTCCCAACTCAGTATAACACACCTGCGGA * *

Mn12262 GGTC---TCACTGTGTGGGGGG------GCTGTCGTGGCG------TCTGTCTTGAGGTT Mm10095 GATCAGCCCCGTGCCTACCGAGTACCACACTCCCGTGGGGAACACCCCCGCCATCACGCC Mm99049 GATCAGCCCCGTGCCTACCGAGTACCACACTCCCGTGGGGAACACCCCCGCCATCACGCC Mn11037 AAGC---TCGGTGGCCAGCGAA------TGA------GCCATTCAG-- Mn10106 CGACGGGCCA--GGTCGAGGAGCGCCATGTCCCTTACAAGAAGATGCCCGCGATCCGCGG Mb07020 GAACACTCCCGTGATCACG------CCTGTCATCACACC * * * * *

Mn12262 TATAACACGGCCGAGGGCATGGGTGAGGGAGAGGTAAGCGTCGCC------GGG Mm10095 CGCCGTCTCTCCCGTCCCGACGGTATACAACACGCCCGCCCCGCCCACCCCGCAGACGGG Mm99049 CGCCGTCTCTCCCGTCCCGACGGTATACAACACGCCCGCCCCGCCCACCCCGCAGACGGG Mn11037 TGCCGCACCTCC--TCTTGACGGCGCCGGAGACCCAAATGACCTCG------CAG Mn10106 AGC--TACCCCTGAGACCGGCATCTTCCACGAGAAACGCGCCATCTGGTTCATCGCCGAG Mb07020 GGT--GCCCACCGAGTACAACA-CTCCAGCGGCGCCCACGCCCCAGGTCAACTGGACCGA * *

Mn12262 GTG-GCCGC--GGTAGCTGGTAGTGGCAACTCGCAGAAGAAG-TACAAGCAGCAGATGCC Mm10095 CTACATCAC--GGTCGAGGACAGGGACATTCCCGAGCTGGAGCTGTCCGCGAT-GATGTC Mm99049 CTACATCAC--GGTCGAGGACAGGGACATTCCCGAGCTGGAGCTGTCCGCGAT-GATGTC Mn11037 GCACTCCGC--GCACCGGCCCAGTCTCGACCCTC-----GAG--ATCGGCATC------Mn10106 GCCGCCCGCCAATTCTTCACCTTTGTTATAGCTGGCCTCGAGTTCATGGAGTC-GCCCCC Mb07020 CGAAGGCGC--ACTGGAGGAGAGGGACATTCCGGAGCTGGACATCTCGATGGA-TATGTC * * * *

Mn12262 GCGGACCGGACGGATCACGCCCC-TCGTGGCGCTACTG--GGCCTCACCAGCCTGCTCGC Mm10095 GCCCACCACGCGGATGGCACATC-TCGTGGGGCTTCTG--GGCCTCGCGAGCCTGCTCGC Mm99049 GCCCACCACGCGGATGGCACATC-TCGTGGGGCTTCTG--GGCCTCGCGAGCCTGCTCGC Mn11037 ----ATCGTGGGGGTCGTGATTG-CGGTCACGCTCCTG--CTAGTCGCGATCTTTGTG-- Mn10106 GCCTGTCTGGGAGCCCGAGAGCAACAACTGCGTTATCGAGGTCGAGGCCAACACGCACGC Mb07020 ACACGCCACGGGGATC-CGATCCTTCGCGGGGTTCCTG--AGCATCGCCAGCTTGCTCGG * * * * * * * *

Mn12262 TGGCGCCAGCGCCGAAAACAAGATCACGTATCCCGACAAGGACGGCATGACGTTCTACCA Mm10095 TGGCGCCAGTGCCGAGAACAAGATCACATACCCCGACAAGGACGGCATGACATTCTACCG Mm99049 TGGCGCCAGTGCCGAGAACAAGATCACATACCCCGACAAGGACGGCATGACATTCTACCG Mn11037 TGGCGGGCACGC-AAAAACAAGA------GGCTCGAGCAGGACAG----GCTCGCCGCTC Mn10106 CGACGGCCGCACTGATTGCCGCATCGCAGGCTCCCCCAAGGGCGAGCCGGCCTTTACCTC Mb07020 CGCCGCCGTGGCCGACAACAAGATCACGTACCCCGACAAGGACGGCATGACGTTCTACCA * ** * * * * * *** * * *

Mn12262 GGGCGACTCCCT-CACCGTCAGA--TACGTGACCGACTACGCGAGCC-CTGAGCTGCACC Mm10095 AGGCGACTCGGT-CACTGTCCAA--TACGTGACCGACTACACGAGCC-CCGAGCTGCACC Mm99049 AGGCGACTCGGT-CACTGTCCAA--TACGTGACCGACTACACGAGCC-CCGAGCTGCACC Mn11037 AGATGGTTAGCGACGCCGAAGAA--TGCACCTCCGAC--CGCGGATC-TCGGGAAGAAC- Mn10106 GTACGGCAATGAGGAGTGGCAGGACTGCTCCATCGGCGTCATCAACAGCTTCACCTTTCC Mb07020 GGGCGACTCCGT-CACTATCAAA--TACGTCACCGACTACGCGAGCC-CCAAGTTGCACC * * * ** * * *

Mn12262 TGTTCTGCTTTG---AACAGGGCAGCGATGTCATTGTCG------Mm10095 TGTTCTGCTATG---AGCAGGGCAGCAATGTCATTGTCG------Mm99049 TGTTCTGCTATG---AGCAGGGCAGCAATGTCATTGTCGGTATGTCTGCTGAATCCACAG Mn11037 -AGCATACCATA---GGCT--TCATCGCCAGCATTGCCG------Mn10106 CGACGCGCCCGGCC-GGCCCGGCCCAGGCCGCATCGAAGTGCAGTGGACCCGGGACGGCG Mb07020 TTTTCTGCTACCCGAAGACTGGCGATGGGGTACTAGACAAGCTCGAAACATCCATTGCGA * * * *

Mn12262 ------ACAAGCTCA Mm10095 ------ACAAGCTCA Mm99049 ACGCCCTGCAGATGGCAGCATCAGATTCGACTAACACTCCCCGGCGTACAGACAAGCTCA Mn11037 ------Mn10106 GCAAGAACGGGCCGGGGTCCGGCGACGACGGCTTCTGGGCGCCCGAGTTCGGCTTCATGG Mb07020 CAGGCACCGGTTCGCAGACCGTCACGATCGATCTCAGCGGGGTCGGCGGCTGCTGGTTCA

Mn12262 AAC--AACCCATCCAAACAGGCACCGGCTCA-AAGACTGTCGCAATCGACCTCGACGGCG Mm10095 AGC--AATCCATCCAGTCAGGCACCGGCTCG-AAAACTGTCGCAATCGACCTCGACGGCG Mm99049 AGC--AATCCATCCAGTCAGGCACCGGCTCG-AAAACTGTCGCAATCGACCTCGACGGCG Mn11037 ------ACACGGACAAGATACTGGCCGCCGA-GAGACCCGAGCGAGCAGC---AACGGCG Mn10106 AGCTCAACCCCCCGGTGTGGTTCTCGGCCGAGAAGACGCTCGAGAGCCAGGGCAATGTCA Mb07020 ACATCCAGACGAGCGGGAACATCTGGGGCGTCAACAGCCCGATATGGACGTTCGGCGGCA * * * * * * *

Mn12262 TCAGCACCTGCTGGTTCAACATCCAGACCAACGGCAACATCTGGGGCGTCAACAGCCCGC Mm10095 TCAGCACCTGCTGGTTCAACATCCAGACCAACGGCAACATCTTTGGCGTCAACAGCCCGT Mm99049 TCAGCACCTGCTGGTTCAACATCCAGACCAACGGCAACATCTTTGGCGTCAACAGCCCGT Mn11037 AGACCTCGCACAAGCA----GTCTAGACCATTC----CGTGTGGTCAGACTGGGGTTCGC Mn10106 TCTGCGACATGAAGATCCGCGACGGCAGCCTGTCGGACAAGTGGTCTGGCTGGCACGTGT Mb07020 GCCGACCCATGCAGCA----GACCGCTACGCCTCCACCGCCGGCCGCGACGGGCCAGACC * * * * * *

Mn12262 TCTGG-ACATTTCAGAGCAGCCGGCCAGCACAGCAGACGACGACGACTACGATAA-GCCT Mm10095 TCTGG-ACGTTCCAGAGCAGCCGGCCGGCCGAGCAGACGACGAG---CACGACAA-GCCC Mm99049 TCTGG-ACGTTCCAGAGCAGCCGGCCGGCCGAGCAGACGACGAG---CACGACAA-GCCC Mn11037 CC--G-GCGGTAGAGAGCAGA-----GGCGT----GATGGTGAGC-TCACCCGAG-TTTC Mn10106 ACAAGTGCGGCGTCCCGTGCCGCGATGCTGCCGATGACTTCGCCGTCCAGGGCAT-GCTC Mb07020 ACTGCGGTGGTAATAAGTAACACGATGAACTTGGGTACGGGATCCCCAGCGCCATCTCTT * * * *

Mn12262 GTCGCCAC----CTGCAACTAGCACACCCACCCCGAGT----GCCGCAGT--CACAATCG Mm10095 ATCGCCACT-CCCTACAGCCAGCACTCCCGCGCCGAGT----CTCGCAGT--CACAATAA Mm99049 ATCGCCACT-CCCTACAGCCAGCACTCCCGCGCCGAGT----CTCGCAGT--CACAATAA Mn11037 GATGCAGC-----GATGGCCAACATTACTG-ATTGGGA----ATAGTAG------AG Mn10106 CCTCCCGTTACCCTGTTGGCAGCAGTGCTGTGTCTAGTCGAGACAGCAGC--CTGCGCAC Mb07020 GATAATAATGCCTCGCCGACGGACACGCCGCCATCGCCGACCACGGCGGCGGCGCAGCAG * * *

Mn12262 TTACACAA------ACAGGTA------CGACAGCCCCTGAACCCT-----CC Mm10095 CTACACAA------ATAAGTA------CGCCAGCTCCTGAGCCCT-----CG Mm99049 CTACACAA------ATAAGTA------CGCCAGCTCCTGAGCCCT-----CG Mn11037 GAGCACGA------AATAGCA------CATCGGCTCCAAG---CA-----CC Mn10106 CGCCACCACCTCCTATTAGTGGTTACTGCTCG-ACCGGCATCTACAAGCACCTGCTGCCC Mb07020 TCACCCAGGCCATCCTCGCCGAATGACGCTCCCGGCACAAGCCCGTCAGGCGGAGCTTCT * * * * *

Mn12262 TCG---GATACGTCCCCTGCGGAGGA-CCAGAGCACAGCAG-----CAGCGCCAT-TGCC Mm10095 TCG---GATGCGTCGTCAGCGGGGGG-GTCAAGCACAGCAG-----CGGCGTCAT-CGCC Mm99049 TCG---GATGCGTCGTCAGCGGGGGG-GTCGAGCACAGCAG-----CGGCGTCAT-CGCC Mn11037 ATG---GATACCTC-CCAG------AAGCACA------CTGCGCCGT-TCCT Mn10106 CTGAGCGATTTCCCTCCTGCTCAAGC-GTTCTGCTCATCGAAATT-TCCCATCCC-TCCC Mb07020 GCGGGAGGCGCTTCTCTGGGTACCGCTGACGAACCGACCACGGCTCCGGTGTCAGGTGCT * * * * * * * *

Mn12262 AAGCCCACTCTCGC--TCGGCGTATCTACGATCACTAAGCCCA--CGAA--CAGCCCTAT Mm10095 AAGCCCGCCTCCGC--TCGGCATCTCTACAAATACGGAGCCCA--CAAA--CAGCCCTCT Mm99049 AAGCCCGCCTCCGC--TCGGCATCTCTACAAATACGGAGCCCA--CAAA--CAGCCCTCT Mn11037 AACATATCTCCCAG--TCGAGATC----CGCCTGCAGA-TCTA--CGA---CATCTGCAT Mn10106 AAGACCGTTACCGAGACCAAGACCACGACCGTGGCAAAGCCCA--AGAGATCGGCCTTTC Mb07020 AACTCGGCTGGTGTATCTGCAGCGTCGACCGGCACTTCCCCGAACCCAACACAGCCTCCA ** * * * * * * *

Mn12262 CC-----CTGGATCTACTGTGG----CCACGGCATCAGATGCTAACCTGACCGTCATGTT Mm10095 CC-----CCGGATTTGCCACGG----TCACAGCATCAGATGCTAACTCGGTCGGCGTCTC Mm99049 CC-----CCGGATTTGCCACGG----TCACAGCATCAGATGCTAACTCGGTCGGCGTCTC Mn11037 GC------TCTACTACAT----CCA------AGATGCTC-CGCAATC--CGTCCT Mn10106 CTGAACATGGGGTCCGCCGGGGTGACCCACGGAGCCCATTGATG-GATGAGCTACAAGGC Mb07020 TCGAACGCGGCGCCCAGTGGTACTAGCAATACCAGCACCACAACCGGCGGCAATACCGGC *

Mn12262 CACAGCATCAGACCAGCCCTC--CG-TGAGCTTGACTCTACAGCCGACAGTGATCGTAAC Mm10095 CGCAGCATTCACCGGGACCCC--TG-CGGGCCCGACACAGCAGGCGCCGGCAACCGTGAC Mm99049 CGCAGCATTCACCGGGACCCC--TG-CGGGCCCGACACAGCAGGCGCCGGCAACCGTGAC Mn11037 CCCAGTTGGACACTGGTCGTT--CG-AGACCTCG------CAAACCCCG------AGT Mn10106 CTTAATAAGAACATGGCTAGCAGTG-CGTGCGCATGTATCCAGACGACGCCCACCGTCAC Mb07020 GATAACAGTAACGGGACCGGTGGTGGTGATGGTGGCAACACTGGCAACGGCGGTGGTAGC * * * * * *

Mn12262 GCCAGGCACCAACAACGGCAACATCAGCGGCAACAGCAACGAGAGCAGCAACAA--CGAA Mm10095 GCCAGGCGGCAGCAGCAGCAACG------AGGGTAACAGCAGCAA--CGAA Mm99049 GCCAGGCGGCAGCAGCAGCAACG------AGGGTAACAGCAGCAA--CGAA Mn11037 CCCGAGGAGCAGC--TGATCGCAT------TCGAGAGCTCCTCGCA--GGAC Mn10106 GAGGACCACCACAATCACAAAAACATCTACAAAACTCACCGACCACCCTACCGGGCCGAC Mb07020 AGT-ACTACGACCACGAACAACACCACCACCAACAACAATAACAACGGCGGTAG--CGAC * * **

Mn12262 GGCGGCCTTTCCACAGGCGCGAAAGCCGGCATCGGCGCCGGTGTCGGCGTTG-GCGCTGC Mm10095 GGCAGCCTCTCCATCGGCGCGAAAGCCGGCATCGGTGCTGGCGTCGGCGTCG-GCGCCAT Mm99049 GGCAGCCTCTCCATCGGCGCGAAAGCCGGCATCGGTGCTGGCGTCGGCGTCG-GCGCCAT Mn11037 AGC--CCGCTTCTGATGCTCACGTGCAAGCA-----AATCGCGCGGGAGCTG-CCGGC-- Mn10106 GTCGACATGCCCTAGCCCAAGCACGGCAACCTTGACGGTGACGTCGGTATCAAGCGTACC Mb07020 GGCGCGCTGACCCCCGCCGCGAAAGCCGGCATCGGCGCCGGTGTGGGTGTGA-GCGCTGC * * * * * * ** **

Mn12262 CCTCGCCATTGCGGGGCTTGTGTACTTCATCCTGGCCCGGCGGAAGCGGCGCGCACAAAA Mm10095 CCTCGCCATCGCGGGGCTCATCTTCTTCATCCTGGCCAGGCGGAAGCAGCGCGCGCGCAA Mm99049 CCTCGCCATCGCGGGGCTCATCTTCTTCATCCTGGCCAGGCGGAAGCAGCGCGCGCGCAA Mn11037 ---CGTGGCCGGACGACACATCCGCTTCGTCAT------CGGCAGCAAGGC------AA Mn10106 AACCGTAATCACCACG-ACGACCACGACAACCCGTGATCCCATGACCCTGACTCAAACCA Mb07020 CCTCGCCATCGCGGGATTGGTGTTCTTCCTCATGGCCCGACGGAAGCACCGCG---AGAA ** * * * * * * * *

Mn12262 TCACCAGTCC-GTGGCTGAGCTCCCCGCTGGAGAGGCCACAAACTATGGAGCCGGCGGC- Mm10095 CCAGGGGCAC-GCGGCCGAGCTTTCAGTCGGAGAGACCACGCACTACGGAGCGGGCGGC- Mm99049 CCAGGGGCAC-GCGGCCGAGCTTTCAGTCGGAGAGACCACGCACTACGGAGCGGGCGGC- Mn11037 CAATGAGCCC-GCAGCCG------CAGCCGCAGCAGCCCCGGGACCTGCGGCGGGAGGC- Mn10106 CTACAGCTACAGAGATCAGTATCACCACCACCACTACGACTTCTGTCACCACCGTCACC- Mb07020 CAACCGGACT-ATGGCCGAGCTCCCTGCCGTGGAGAATACGGATTATGGCGGCGGCGGCG * * * * *

Mn12262 --TTTGGCCAGCCGCAGCAGC------TGT--ACTTCCCGTCCGCA------GC Mm10095 --TTTGGCCCGTCGCAGCAGCAGCAACTGT--ACTTCCCGTCCGCA------GC Mm99049 --TTTGGCCCGTCGCAGCAGCAGCAACTGT--ACTTCCCGTCCGCA------GC Mn11037 ----CGGCCTCTCGCTGCCAC------GGC--GCT------GCA------G- Mn10106 --CAGGACCCTTCTCCATCTCCTTCCGGGG--ACT-CTTGCCTGGA------TT Mb07020 GGTTTGGGCAGTCGCAGCACTACCTTGGGTCGGCGGCTGGGATGGACAAGAATGGGAGGC * * * * * * * *


Mn12262 TGCCGCGGACAA---GAGCCA--TGGCATTCCG-GTCGGGGC-----TGGTGGTTGGTCG Mm10095 TGCCGCCGACAA---GAGCCA--CGGCATCCCG-GCAGGGGC-----TGGCGGCTGGGCG Mm99049 TGCCGCCGACAA---GAGCCA--CGGCATCCCG-GCAGGGGC-----TGGCGGN-GGGCG Mn11037 -GCCGC------GCGCCG--TGGCCCTGCG-GC---ATC-----TGCAGCTCGAGTG Mn10106 TGCCCTCAACCTTCCGCCCCGACCAGCCCTGTG-GCTGCGCC-----TACGCGGCGTCCT Mb07020 AACCAGGGCATGACGGCGGCAGCTGGCCACCCGCGCCACAGCAGACGCAGCAACCGTATG ** * * ** * * * *

Mn12262 GCGGCGCC---GCCAG------AGGCGCAAG---GGCAGCAG----CAGA---GC Mm10095 CCGCCGCCACCGCCAG------AGACACAGG---GGCAGCAG----CAGA---GC Mm99049 CCGCCGCCACCGCCAG------AGACACAGG---GGCAGCAG----CAGA---GC Mn11037 GCGCCTCCACCCCACG------ATCCTCAAG---GGCGGCG------AGC---TC Mn10106 GCTTCACCACCCCTCGTAC-CTATGCCAGCCGCTGGCTCGGCCGCGA----CATCCCCAC Mb07020 CTGTCCCCGGACAGAACGGGCCGTATGAGATGTACACAGGACCACAGGGCCCGGAGTTGG * ** * * * *

Mn12262 GAGCCGTATGAGCTGTACACGGGGCCGCAGGGATCGGAGCTGCCTGCTACGGAGCGGGCG Mm10095 GGGCCGTACGAACTGTACACGGGGCCGCAGGGGTCGGAGCTGCCTGCGACAGAGAGGGCA Mm99049 GGGCCT------CCTCA------Mn11037 GTGCCC------GACGCCA------Mn10106 TGGCCAGGAGTCCTACGCAGCCTGCGCCGCCTCATGCGACCGCAACTTCTACTGCGAGCT Mb07020 CGGCCGTGGAGAGGCCGCAGGAGGTGCCATCACCGTCGAGACCACAATTTAAAGTCGG-G *** *

Mn12262 GGGGGGAAGAGGTGGATGGTCGCATTGTTGCAAAAAG--CCCGGTCGCCAGCGGATGAGC Mm10095 GAGGGTAAGAGGTGGATGGCCGCATTGCTTCACAAAG--CTCGGTGGCCGGCGGATGAGC Mm99049 ------TGGCCGCATTGCTTCACAAAG--CTCGGTGGCCGGCGGATGAGC Mn11037 ------AGATGCGCGGCTGCACCGAG--GCCATCCGCGCGTGCCTGG-- Mn10106 TTTTACGTATAGCAGCCAGACGCGCGAATGCTTCCAGTACTCTGGCGTCACCGAGTGGGA Mb07020 GAGAGAGGGAAATGGATGGTTGCATTGGTTCCCGAAG--CCCGGTGGCCGGCGGATCAGC * ** * * ** * * * *

Mn12262 CTT------TCAGC------GCCGCACCTCCACTAGACGGCGCTGGAGACCCAACCGA Mm10095 CTT------TCAGT------GCCGCACCTCCTTTAGACGGCGCTGGAGACCCAACCGA Mm99049 CTT------TCAGT------GCCGCACCTCCTTTAGACGGCGCTGGAGACCCAACCGA Mn11037 ------TCGGC------GG------TCCCGGCGACGGCGATGGGGCGCCGGGGGG Mn10106 CGTCAACTACACCAGCTATGGGGCCGTGTACCTTTCAGAGGCCGATAGAATCTCTGCGGG Mb07020 CAT------TCAGT------GCCGCACCTCCCCTCGATGGCGCTGGAGACCCGAACGA * * * * ** * ** * * * *

Mn12262 CCTCGC---GGGAACTGGACGCACTGGCCCA-GTCTCGAC------TCTGGAGGTGGG Mm10095 CCTCGC---GGGAACTGGACGCACTGGCCCA-GTCTCGAC------TCTGGAGGTGGG Mm99049 CCTCGC---GGGAACTGGACGCACTGGCCCA-GTCTCGAC------TCTGGAGGTGGG Mn11037 GATCAC---GTCGGCTCGCGTCACCGTCGTC-ATCTCGGG------CCCGCTCGCGAG Mn10106 TCTCACTGTGTGGGGGGGCTGTCGTGGCGTCTGTCTTGAGGTTTATAACACGGCCGAGGG Mb07020 CCTCGC---AGGCACTCCACGCACTGGACCA-GTCTCGAC------TCTAGAGGTGGG ** * * *** * * * *

Mn12262 CATCATCGTGGGAGTCGTCATCCTCGTCACGCTCCTCCTCGTCACAATCTTTGTGTGGAG Mm10095 CATCATCGTGGGAGTCGTGATCGTCGTCACGCTCCTTCTCGTCGCAATCTTTGTGTGGAG Mm99049 CATCATCGTGGGAGTCGTGATCGTCGTCACGCTCCTTCTCGTCGCAATCTTTGTGTGGAG Mn11037 CTCCTTTGTG------CCGCAGGCCACCTCGTC-----TTTCGGCTCGAT Mn10106 CATGGGTGAGGGAGAGGTAAGCGTCGCCGGGGTGGCCGCGGTAGCTGGTAGTGGCAACTC Mb07020 CGTCATCGTCGGGGTCGTCATCGCAGTCACTCTGCTCCTCGTTGGAATCTTCGTGTGGCG * * * ** *

Mn12262 GGCACGAAAGAACAAGAAGCTCGAGCAA------GATAGACTCGCCGCTCAGATGGCTAG Mm10095 GGCACGCAAGAACAAGAAGCTCGAGCAG------GACAGACTCGCCGCTCAGATGGCCAG Mm99049 GGCACGCAAGAACAAGAAGCTCGAGCAG------GACAGACTCGCCGCTCAGATGGCCAG Mn11037 GCGCCTCACCAACGTCTACGTCGAGC------CGGCGCT-GGATGTCATG Mn10106 GCAGAAGAAGTACAAGCAGCAGATGCACCTCTCCGCTCTGCCCCTGGCTCTGGCCGCCTT Mb07020 AGCACGCAAGAACAAGAAGCTCGAACAA------GACAGACTCGCCGCCCAAATGATCAG * ** * * * **

Mn12262 CGACGTCGAAGAGTGCACCTCGGACCGCGGATCCCGGGAGGAGCAGCATACCATGGGCTT Mm10095 CGACGTCGAAGAGTGCACCTCGGACCGCGGGTCTCGAGAGGAGCAGCACACGATGGGCTT Mm99049 CGACGTCGAAGAGTGCACCTCGGACCGCGGGTCTCGAGAGGAGCAGCACACGATGGGCTT Mn11037 ----GTCGA----CATGCTGCGGGCCACGTGTCCCGCGCTGCGGAGCAT----CGAGTTT Mn10106 GGCCCTCAGCACCGCGACCTCTGCCATCGAGCTCCACATCAACATCACTACCGGAGGCGT Mb07020 CGATGCCGAGGAATGCAGCTCGGACCGCCCGTCGCGAGAGGACCAGCACACCATCGGCTT * * * * * * * *

Mn12262 CATCACCAGCATTGCCGACACGGACA------AGATCCTGGCCGCTGAGAGGCCCGAGA Mm10095 CATCGCCAGCATTGCCGACACGGACA------AGATCCTAGCAGTCGAGGGACCCGAGA Mm99049 CATCGCCAGCATTGCCGACACGGACA------AGATCCTAGCAGTCGAGGGACCCGAGA Mn11037 G------TAGGGATATACACG------ACAAAGTGGCTGC--GGAACCTCGAGG Mn10106 GCTCGTTCCGGTTGTCTTTGCCTACGTTAGTGCAGACAACGGCGACTACTACCCGTTCGG Mb07020 CATCGCCAGCATTGCCGACACCGACA------AGATCTTGGCCGCTGAGAGACCTGACA * ** * * ** *

Mn12262 GGATAGCAGCAGCAAGAACTCGC--GCGGGTAGTGTAGACCAT------TCCGTGTGGT Mm10095 GGGCAGCGGCGGCGAGAACTCGC--GCGGGCAGTGTGGAACAT------TCCGTGTGGT Mm99049 GGGCAGCGGCGGCGAGAACTCGC--GCGGGCAGTGTGGAACAT------TCCGTGTGGT Mn11037 CGAGCTTGGCAG-AGGAGCATGG--GATACAAGTCAAGCGCGG------CCGATCGGGT Mn10106 CTACTTCGTCGACGGGTGCAAGGCGAGGGACAACCTCAAGCAGATCTGCATCGACGGGAG Mb07020 GGACTGCTGCCACGAGAGCTCGC--ACAGGCAGTGTAGATCAT------TCGGCGTGGT * * * * * * * *

Mn12262 CGGAC-TGGGGCGCA-----TCTGG-----CGGTAGAGAGCAGCAGC-GAGATG------Mm10095 CAGAC-TGGGGCACA-----CCTGG-----CGGTAGAGAGCAGCGGC-GAGATGGTGAGC Mm99049 CAGAC-TGGGGCACA-----CCTGG-----CGGTAGAGAGCAGCGGC-GAGATGGT---- Mn11037 CCGGA-AAGGGGGC------CTGG-----ACATATACAGCTGCA------Mn10106 CAGACATAGGGCCCACGTAGTCTGGGATACAGGCAGGAAACAATGCCTGAGCCGGACCAA Mb07020 CGGAC-TGGGGCTCG-----CCTGG-----CAATCGAGGGCAGCGGC-GTGATG------* * *** * **** *

Mn12262 ------TGGAGGAGCA Mm10095 TACACCCTCATTACGTTCCCGCGATGGCCGCTATGACTAACTGGATACAGTGGAGGAGCA Mm99049 ------GAGCTACA Mn11037 ------Mn10106 AGTCGACTCGGTCACTGACGGCAGGAACCGCTATGAGTACCAGACTTACACGGAAGTGGC Mb07020 ------TGGAGGAGCA

Mn12262 CGAGA------TAGCGCATCGGCT--TCAAGC------ Mm10095 CGAGAATGTCTACTGAGCAGCCCAACACCCCGTCCATT--TCAAGCCCTGCAAGCGGAGG Mm99049 C------CCTCATTACGT--TCCCGC------ Mn11037 ------TTATG------ Mn10106 CTGCACGTGGATGGTCGCATTGTTGCAAAAAGCCCGGTCGCCAGCGGATGAGCCTTTCAG Mb07020 CGAGA------TAGAACATCGACT--CCAAGC------

Mn12262 ------ACCTAT------Mm10095 AATGCAATCTGTCATGTCGACCCTCCGTGGGGCTCTCCGGCCCGGGGACGCCCCACGACA Mm99049 GATGGCCGCTAT------Mn11037 ------TGT------Mn10106 CGCCGCACCTCCACTAGACGGCGCTGGAGACCCAACCGACCTCGCGGGAACTGGACGCAC Mb07020 -ACCCAAGCCAT------

Mn12262 ------Mm10095 AGACAGACAGAACAACTCGTCTCTGGCGTCAACTCAGTCGAAACAGCATGTACCACTGCG Mm99049 ------Mn11037 ------Mn10106 TGGCCCAGTCTCGACTCTGGAGGTGGGCATCATCGTGGGAGTCGTCATCCTCGTCACGCT Mb07020 ------

Mn12262 ------Mm10095 ACAGCACGAGCAAGAACCACAACAACAACTTCGCGACGACTACAACCCTGATGAGAACCA Mm99049 ------Mn11037 ------Mn10106 CCTCCTCGTCACAATCTTTGTGTGGAGGGCACGAAAGAACAAGAAGCTCGAGCAAGATAG Mb07020 ------

Mn12262 ------GCCCTGCCC------Mm10095 GTCGCGGGCCCTCCCCGAGGCCCATCGTGACGC-----CACTGGCCACACGACGGTGCTC Mm99049 ------GACTAACTGGA------Mn11037 ------GC------Mn10106 ACTCGCCGCTCAGATGGCTAGCGACGTCGAAGAGTGCACCTCGGACCGCGGATCCCGGGA Mb07020 ------GCTCACCGGGA------*

Mn12262 ------CCCAGGA-- Mm10095 GATAGCATTATGCCGGGAAACCAGGGCTTGTCGGGGAAGCCAGGCTGGGGTTCTAGGAGC Mm99049 ------TACAGTGG- Mn11037 ------Mn10106 GGAGCAGCATACCATGGGCTTCATCACCAGCATTGCCGACACGGACAAGATCCTGGCCGC Mb07020 ------

Mn12262 ------AGTAGTA------Mm10095 GATTCGAACCCGGCGCCTATCACCGCCAGCAGTAGCAGCAATGGAGGCGGGATCCTCGGT Mm99049 ------AGGAGCA------Mn11037 ------AGCA------Mn10106 TGAGAGGCCCGAGAGGATAGCAGCAGCAAGAACTCGCGCGGGTAGTGTAGACCATTCCGT Mb07020 ------AGAAAGAT------*

Mn12262 ------ Mm10095 GCGCTAGACTCAACTCGCCAACGCCGGAGCAGCGCTACGAGCTTGCATGAAGAAACTGGC Mm99049 ------ Mn11037 ------ Mn10106 GTGGTCGGACTGGGGCGCATCTGGCGGTAGAGAGCAGCAGCGAGATGGTGAGTCACACCT Mb07020 ------

Mn12262 ------T Mm10095 A----ACAGTGACGAGCAGCAGCAATTCGCCACCACCGAACGACAGAGGCATTTGATAAT Mm99049 ------CGAGAAT Mn11037 ------T Mn10106 ATATTACGTTCCTACGATGGCCGGCATGACTGACTGGATACAGTGGAGGAGCACGAGAAT Mb07020 ------GGCTTCCCCCAA

Mn12262 GCCACTCAGCAACCCCAGCGACGACTCCGATGCTCTCGCAGTCGAGGCAGCATCCCGCCT Mm10095 GGCACCCAATAACCCCAGCGACGACTCCAATGCTCTCGCGGCCGAGGCAGCATCCCGCCT Mm99049 GGCACCCAATAACCCCAGCGACGACTCCAATGCTCTCGCGGCCGAGGCAGCATCCCGCCT Mn11037 GTCCCGGAATAGTCCCAGAGATGACCCCGACGCTACCGCCGCCGAGGCAGCATCCCGCCT Mn10106 GCCACTCAGCAACCCCAGCGACGACTCCGATGCTCTCGCAGTCGAGGCAGCATCCCGCCT Mb07020 GGAGCACCGCAGCCCTAGCCCCGAGCCCGGCGGTGCAACAGCAGAAGCAGCATCCCGCCT * * * ** ** ** ** * * * * ** **************

Mn12262 CGACGCGGAGCTCGACATCCTAGCCGCAATGTACCCGGACGCCGGCGCCGTGGTCTACAG Mm10095 CGGCGCGGAACTCGATATCTTGGCCGCAATGTACCCGGAAGCGGGTGCTGTGGTCTACAC Mm99049 CGGCGCGGAACTCGATATCTTGGCCGCAATGTACCCGGAAGCGGGTGCTGTGGTCTACAC Mn11037 CGGCGCGGAGCTTGACATCCTGTCTGCAATGTACCCGGAAGACGGCGCCGTCGTTTACAG Mn10106 CGACGCGGAGCTCGACATCCTAGCCGCAATGTACCCGGACGCCGGCGCCGTGGTCTACAG Mb07020 CGCCGCGGAGCTGGATATCCTCTCTGCGATGTACCCCGACCCGGGCGCCGTGACCTACAT ** ****** ** ** *** * * ** ******** ** ** ** ** ****

Mn12262 CCACGCACGAGGCCGCGAGGTTCGCTTCACCATGGACA---CGGGAACTCTGGTC--CTC Mm10095 CCCGACACGAGGCCGCGAGGTTCGCTTCACCATGGACA---CGGGAACTCTGGTC--CTC Mm99049 CCCGACACGAGGCCGCGAGGTTCGCTTCACCATGGACA---CGGGAACTCTGGTC--CTC Mn11037 CCTCGCGCGCGGGCGCGAGGTCCGCTTCGCCATGGACA---CGGGGACTCTGGTC--CTC Mn10106 CCACGCACGAGGCCGCGAGGTTCGCTTCACCATGGACA---CGGGAACTCTGGTC--CTC Mb07020 GCAC---CGCGGCCGGGAATTCCGCTTCATGATTCCCAGTCCCGGGATCGAGACCGGCAC * ** ** ** ** * ****** ** ** * ** * * * * *

Mn12262 CGCCTCCCGGAGTTCTAT------CCCGTAACCGGGTTCCCCGAGATCCTGACCG Mm10095 CGCCTTCCGGAGCTCTAC------CCCATGACCGGGTTCCCCGAGATCCTGACAG Mm99049 CGCCTTCCGGAGCTCTAC------CCCATGACCGGGTTCCCCGAGATCCTGACAG Mn11037 CGCCTCCCGGAGCTCTAT------CCCGTCGCGGGAAGCCCGGAGATCCTGACGG Mn10106 CGCCTCCCGGAGTTCTAT------CCCGTAACCGGGTTCCCCGAGATCCTGACCG Mb07020 CGGCACCAGGGGCAGCACGGGAGCGGCGACGCTCGTCCTGCGCCTCCCGGAGCTTTACCC ** * * ** * * * * * * * * ** * * **

Mn12262 CGTCGTGGTCCTCCGCCACCTC------ATCCGACAGAACCGGCAGCAGACACAAGG Mm10095 CGTCGTCGTCCTCCTCCTCCTCCTCCCAGTCATCCGCCAGAACCAGCAGCGGAAACAAGG Mm99049 CGTCGTCCTCCTCCTCCTCCTCCTCCCAGTCATCCGCCAGAACCAGCAGCGGAAACAAGG Mn11037 CCGCCTCGTCCTC------ATGCGCCGGCGACCACAAGG Mn10106 CGTCGTGGTCCTCCGCCACCTC------ATCCGACAGAACCGGCAGCAGACACAAGG Mb07020 CGTCACAGGCCACCCCGAGATCCT------GACCGCGTCGTCGGGGAACAGCCACCAAG * * ** * * ** * *

Mn12262 ACCTTCGCGACCAGACTAGAGCTGCAGTCGAGGCTCTCGGCCTCCCCGCCGTGGGGGGCG Mm10095 ACCTTCGCGATCAGACTAGGGCTGCAGTCCAGAATCTCTGCCTGGCCGCCGAGGGAGGCG Mm99049 ACCTTCGCGATCAGACTAGGGCTGCAGTCCAGAATCTCTGCCTGGCCGCCGAGGGAGGCG Mn11037 ACCTGCGCGACCAGACCCGGGCTGCAGTCCACGCCCTCGGTCTCCCCGCCGAGGGAGGCG Mn10106 ACCTTCGCGACCAGACTAGAGCTGCAGTCGAGGCTCTCGGCCTCCCCGCCGTGGGGGGCG Mb07020 ACCTGAGGGACCCGACAAAAGCTGCAGTCGGCGCTCTCGGGCTGCCGGCGGAAGGGGGCG **** * ** * *** ********* *** * ** * ** * ** ****

Mn12262 ACGAGGTCCTCGACGCCCTGATTCTCGCTTTCCAGCAAGTCGTCGAGGCGGAGCAACGAC Mm10095 GCGAGGTTCTCGACGCGCTGTTTCTCGCTTTCCAGCAAGTCGTCGAGGCAAAGCAACGAC Mm99049 GCGAGGTTCTCGACGCGCTGTTTCTCGCTTTCCAGCAAGTCGTCGAGGCAAAGCAACGAC Mn11037 ACGAGGTCCTCGACGCGCTGATTCTCGCCTTCCAGCAAGTCGTCGAGGCTGAACAACGAC Mn10106 ACGAGGTCCTCGACGCCCTGATTCTCGCTTTCCAGCAAGTCGTCGAGGCGGAGCAACGAC Mb07020 AAGAGGTTCTCGACGCGCTGATACTGGCTTTCCTGGAGGTCGTTGAGGCCGAGCAGCAGC ***** ******** *** * ** ** **** * * ***** ***** * ** * *

Mn12262 ------AAAGCACCGCAAGCAACGATGG------TGCAGACCCGGGCGAGCCAGGAGCTG Mm10095 ------AAAATACTGCAAGCAATGATGG------TGCGGACCCGAGCGAGCCAGAAGGAG Mm99049 ------AAAATACTGCAAGCAATGATGG------TGCGGACCCGAGCGAGCCAGAAGGAG Mn11037 GGCAGCAAAGCACCGCAGGCAGCAGCAA------TACGGACCCGGGCGGGCCGGGGAGCG Mn10106 ------AAAGCACCGCAAGCAACGATGG------TGCAGACCCGGGCGAGCCAGGAGCTG Mb07020 ---AGCAGACTACCGCGGGCAGTGATGACCATCATAAGGGCGCGGGAAGGCCCAGTGGCG * * ** ** *** * * * ** * *** *

Mn12262 ATGTGAA---AGAGGACGATGCATACCTGCAGGCCTCCCACGGACGAC------Mm10095 GCGCAATCACAGAGGACGGCGCAATCCAGCAGGTCTCCCACGGACAGC------Mm99049 GCGCAATCACAGAGGACGGCGCAATCCAGCAGGTCTCCCACGGACAGC------Mn11037 GTATGGG---AGACGACGGCACACCGTTGCCAATCTCGCAGAAGCACCAGCAGCAGCATC Mn10106 ATGTGAA---AGAGGACGATGCATACCTGCAGGCCTCCCACGGACGAC------Mb07020 ATGTGAC---AGACGACGGCACACACTCTGCACGCCAGCAGCAACAGCAGC------*** **** ** * ** * *

Mn12262 ------GACCACGAC------GGCAGCACAAGACAGTCATAATATGGTTGCATCACC Mm10095 ------GACCACGACCACAACGACAGCACAAGACGGTCATAATATGGCTGCACCATC Mm99049 ------GACCACGACCACAACGACAGCACAAGACGGTCATAATATGGCTGCACCATC Mn11037 GGCAAGCAAGACGACGACGACCACGGCAGCACAAGACGGTCATAATATGGCTGCACCACC Mn10106 ------GACCACGAC------GGCAGCACAAGACAGTCATAATATGGTTGCATCACC Mb07020 ------AGCAGCAGCCAAAACGACAGCATAAAACGGTGATAATTTGGCTACACCACC * * * * ***** ** ** ** ***** *** * ** ** *

Mn12262 TTCTTAACACCAATAAGCGGAAGCTAGCGCTGCATCCGACCCTGCAGCCGACTTC----- Mm10095 TCCTCAACACCAACAAGCGGAAGCTCGCGCTGCATCCGACCCTGCAGCCGCCATC----- Mm99049 TCCTCAACACCAACAAGCGGAAGCTCGCGCTGCATCCGACCCTGCAGCCGCCATC----- Mn11037 TGCTCAACACCAACAAGCGCAAGCTCGCGCTGCACCCAACCCTACAGCCACCTTCACCAC Mn10106 TTCTTAACACCAATAAGCGGAAGCTAGCGCTGCATCCGACCCTGCAGCCGACTTC----- Mb07020 TGCTCAACACCAACAAGCGGAAGCTCGCGCTGCACCCGACTCTCCAGCAGTC------* ** ******** ***** ***** ******** ** ** ** **** *

Mn12262 ----TTTGCA------CCTCCAGTCCTCG------ACCACTGCCGCAACCG Mm10095 ----CTTGCA------CGTCCAGCCCTCG------ACCACCACCACAACCG Mm99049 ----CTTGCA------CGTCCAGCCCTCG------ACCACCACCACAACCG Mn11037 CACCCCCGCAGTCTCAATCTCACCAGTCCTCGGGTACCACCATCACCACCACTACTACAG Mn10106 ----TTTGCA------CCTCCAGTCCTCG------ACCACTGCCGCAACCG Mb07020 ----CCTGCC------CCATACCTCA------CAC-ACAGCTCCCG ** *** **** *** * * * *

Mn12262 TCGCCGCTGGCAATGTGATTTCAGGCGTCACGAAGCCAGGATACCCAGGCGTCCTGCTGT Mm10095 CCGTCGCTGGTAGTGAGATTACAGGCATTACGAAGCCAGGATACCCGGGCGTCCTGCTGT Mm99049 CCGTCGCTGGTAGTGAGATTACAGGCATTACGAAGCCAGGATACCCGGGCGTCCTGCTGT Mn11037 CCGCCGCTGGTGGGATGATCACGGGCGTCACGAAGCCGGGATACCCGGGCGTCCTGCTGT Mn10106 TCGCCGCTGGCAATGTGATTTCAGGCGTCACGAAGCCAGGATACCCAGGCGTCCTGCTGT Mb07020 CCACGCCGAGCAGAGGGATCACGGGTGTCACGAAACCAGGATACCCAGGGGTCCTACTGT * * * *** * ** * ***** ** ******** ** ***** ****

Mn12262 ACTCAGGCCCGCGGGACCTCGTGGCCGCGCACGTGTCGGAGCTGCGGGGCCAGCGGTGGC Mm10095 ACTCGGGCCCACGGGACCTCGTGGCTGCGCACGTGTCGGAGCTACGGGGCCAGCGATGGC Mm99049 ACTCGGGCCCACGGGACCTCGTGGCTGCGCACGTGTCGGAGCTACGGGGCCAGCGATGGC Mn11037 ACTCGGGCCCGCGGGACCTCGTGGCCGCGCACGTGTCCGAGCTGCGGAGCCAGCGGTGGC Mn10106 ACTCAGGCCCGCGGGACCTCGTGGCCGCGCACGTGTCGGAGCTGCGGGGCCAGCGGTGGC Mb07020 ACTCGGGGCCGAGGGACCTGGTAACCGCGCACGTCGCCGAGCTGCGGACGCAGCGGTGGC **** ** ** ******* ** * ******** * ***** *** ***** ****

Mn12262 AGGCGTTCCAGGTGCGGTACGACAGCGACGACTACGACG---ACCTGGACCTTGACGCCG Mm10095 AGGCGTTCCAGGTGCGGTACGACAGCGACGACGATGATG---CCCTTGACCTTGATGCCG Mm99049 AGGCGTTCCAGGTGCGGTACGACAGCGACGACGATGATG---CCCTTGACCTTGATGCCG Mn11037 AGGCATTCCAGGTACGGTACGACAGCGACGATGACAATGTTGACATTGACGACGACTTCA Mn10106 AGGCGTTCCAGGTGCGGTACGACAGCGACGACTACGACG---ACCTGGACCTTGACGCCG Mb07020 AGGCGTTCCAGGTGCGGTACGACAGCGACGACGATA------TCCTG--CTGGACGCCG **** ******** ***************** * * ** *

Mn12262 GCGCC---GAGGAGGGCGGCGGGAACAAAGG------Mm10095 GCGCC---GACGAAGGCGGCGGGGACAAAGA------Mm99049 GCGCC---GACGAAGGCGGCGGGGACAAAGA------Mn11037 AGGACTGGGACAAGAGCGGTGGGGAGGAGGAGGAGGAGGAGGAGGATGAGGCCGCCAGCA Mn10106 GCGCC---GAGGAGGGCGGCGGGAACAAAGG------Mb07020 CCGGC------AAGGAGGGCGGGAAGG------* * * ** *** *

Mn12262 ------CCGTATAGTCGACGCCGATTCTCAGAAGA Mm10095 ------CCGTAAATTCTACGCCGAGTCTCGCAAGA Mm99049 ------CCGTAAATTCTACGCCGAGTCTCGCAAGA Mn11037 ACCACCACTACCACCACCACCACCACCACCACCATGCAGCCGAACCCGAGTCTCACAAGG Mn10106 ------CCGTATAGTCGACGCCGATTCTCAGAAGA Mb07020 ------GTGCGGCCCACGTCAAGACACTCAAGA * * * * * * * ***

Mn12262 ATGACAGGGTAAAAGGTCGTCAGTCGTGGA---CAAGGGGCTTGACGGTCAGTACACGCC Mm10095 ATGACGAGGTGAAAAGTCATCAATCGTGGA---CAAAGAGCTCAGCGGTCAGTACATACC Mm99049 ATGACGAGGTGAAAAGTCATCAATCGTGGA---CAAAGAGCTCAGCGGTCAGTACATACC Mn11037 GCGGAAGTGGGAAAAGCCGTCAGACGTGGAAAACAAAAGGCACGGCGAGTGGTGCTCGCC Mn10106 ATGACAGGGTAAAAGGTCGTCAGTCGTGGA---CAAGGGGCTTGACGGTCAGTACACGCC Mb07020 GCGGAGAGATCAAAGGCCAGCAAGCTAGGA--GCGGGACGGCGGGTAAAAAG-GCCTGCG * *** * * ** * *** * * * * *

Mn12262 AGGACGCAGTGCTGCCGGGACTGTGGACGTTCGAACACGCCCCCGGCACCATTGCAGAGG Mm10095 AGGACGCAGTGCCGCCGGGACTATGGACGTTCGAACACGTTCTTGGCACCATTGTAGAGG Mm99049 AGGACGCAGTGCCGCCGGGACTATGGACGTTCGAACACGTTCTTGGCACCATTGTAGAGG Mn11037 CACACGCTGTGCCGCCGGGACTGTGGCTGTTCGAGCACGGCCCCGGCACCATTGTAGAGC Mn10106 AGGACGCAGTGCTGCCGGGACTGTGGACGTTCGAACACGCCCCCGGCACCATTGCAGAGG Mb07020 ATAGCGTGCTGCCGCCTGGGCTCTGGGAGTTCGAGCACGGCCCGGACACCATCGTCGAGG ** *** *** ** ** *** ****** **** * * ****** * ***

Mn12262 CGGAGTCGATTGCGCTGGCGGCAAGAAGCATTAGACAGGAGAGTCACAGGGAGATATTCC Mm10095 TGGAGTCAATTGCGCTGGCGGCAACGAGCATCAGGCAGGAGAAGCACAGGGAGATATTCC Mm99049 TGGAGTCAATTGCGCTGGCGGCAACGAGCATCAGGCAGGAGAAGCACAGGGAGATATTCC Mn11037 TAGAGTCAATCGCACTGGCGGCGAGAAGCATTACAGAGGAGAGGCACAGGGAGATATTCC Mn10106 CGGAGTCGATTGCGCTGGCGGCAAGAAGCATTAGACAGGAGAGTCACAGGGAGATATTCC Mb07020 TGGAGTCGATCTCGCTGGCGGCGAGGAGCATCAAGGTGGAGAAGCATAGGGAGACGTTCC ***** ** * ******** * ***** * ***** ** ******* ****

Mn12262 TTGCCGCAGTTGGAGTCAAG Mm10095 TCGCCGCAATTGGAGTCAAA Mm99049 TCGCCGCAATTGGAGTCAAA Mn11037 TCGCCGCGATTGGGGTCAAG Mn10106 TTGCCGCAGTTGGAGTCAAG Mb07020 TGGCTGTGATCGGGGTAAAG * ** * * ** ** **


Chapter 4 Population Genetics

4.1 Introduction

4.1.1 Genetic diversity

Population genetics seeks to describe the fluctuation in the genetic makeup of a population over time (Hartl and Clark 2007). This change in the relative frequencies of different alleles within the population is influenced by factors that include the size of the population, the mating frequency, and the mutation rate. Broadly, a population with a large number of different alleles has higher genetic diversity than a population consisting of genetically similar individuals. Among pathogens, genetic diversity represents an important reservoir of tools that may help a pathogen to attack a wider range of hosts, to resist certain fungicides or plant defenses, and to survive in diverse conditions (Agrios 2005). Although genetic diversity within a species as a whole may be important for the long-term persistence of the pathogen, within a particular population of an agricultural pest (e.g. within a single field), genetic diversity may be restricted by strong selective pressure from fungicides or from plants that have been bred for resistance (Zhan et al. 2003).
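As a rough illustration of how the diversity at a single locus might be quantified, the sketch below computes gene diversity (expected heterozygosity), H = 1 - sum(p_i^2), from a list of observed alleles. This statistic is not named in the text above; the choice of statistic, the function name `gene_diversity`, the allele labels, and the two example samples are all assumptions made for illustration only.

```python
from collections import Counter


def gene_diversity(alleles):
    """Return 1 - sum(p_i^2), where p_i is the frequency of allele i at one locus."""
    n = len(alleles)
    if n == 0:
        raise ValueError("at least one allele is required")
    counts = Counter(alleles)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


# Hypothetical samples: four distinct alleles versus a nearly clonal sample.
diverse_sample = ["a1", "a2", "a3", "a4"]
clonal_sample = ["a1", "a1", "a1", "a2"]

print(gene_diversity(diverse_sample))
print(gene_diversity(clonal_sample))
```

Under these assumptions, the sample with many equally frequent alleles scores higher than the nearly clonal one, which matches the qualitative statement in the paragraph above.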

In a large, randomly mating population with low levels of mutation, selection pressure, and incoming gene flow, the assortment of alleles in a generation of offspring is random, based only on the frequencies of each allele in the parent population (Hartl and Clark 2007). This is known as the Hardy-Weinberg (HW) principle (Hartl and Clark 2007). This model may be useful in predicting gene flow within a population. However, the assumptions inherent in the HW principle may be violated in real populations for several reasons. For example, loci that are physically proximate tend to move together during phenomena such as crossing-over, which shuffle the parental alleles, so the relative frequencies of two such alleles in an offspring population are not independent. Similarly, major changes in genome organization such as inversions prevent recombination within the inverted segment during meiosis (Hartl and Clark 2007). Linkage disequilibrium occurs when alleles are inherited non-randomly for these or other reasons (Hartl and Clark 2007), and thus can occur within individuals or populations that otherwise meet HW conditions.
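The two ideas in this paragraph can be sketched numerically. The example below assumes a diploid, biallelic locus for the HW expectation (p^2, 2pq, q^2) and two biallelic loci scored as haplotypes for the linkage disequilibrium coefficient D = p_AB - p_A * p_B. The function names, the allele labels 'A'/'a' and 'B'/'b', and the two example samples are hypothetical and are not taken from the data in this thesis.

```python
def hw_genotype_frequencies(p):
    """Expected frequencies of genotypes (AA, Aa, aa) given allele-A frequency p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q


def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for two biallelic loci.

    haplotypes is a list of (allele_at_locus_1, allele_at_locus_2) tuples,
    with alleles written 'A'/'a' at the first locus and 'B'/'b' at the second.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    p_a = sum(1 for first, _ in haplotypes if first == "A") / n
    p_b = sum(1 for _, second in haplotypes if second == "B") / n
    p_ab = sum(1 for pair in haplotypes if pair == ("A", "B")) / n
    d = p_ab - p_a * p_b  # zero when the two loci assort independently
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denominator if denominator > 0 else float("nan")
    return d, r_squared


# Hypothetical samples: alleles always inherited together versus freely shuffled.
clonal_sample = [("A", "B")] * 5 + [("a", "b")] * 5
recombining_sample = [("A", "B"), ("A", "b"), ("a", "B"), ("a", "b")] * 5

print(hw_genotype_frequencies(0.6))
print(linkage_disequilibrium(clonal_sample))
print(linkage_disequilibrium(recombining_sample))
```

In this sketch, the sample in which the two alleles always travel together yields a non-zero D, whereas the sample containing all four haplotypes in equal numbers yields D of zero.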

Another cause of linkage disequilibrium is a reduced rate of sexual reproduction and/or an increased frequency of asexual reproduction (Hartl and Clark 2007). Clonal reproduction may preclude recombination and can result in the inheritance of a full suite of alleles as a single unit. Although large asexual populations with a high mutation rate may have allele frequencies similar to those present in populations with sexual recombination (Chasnov 2000), small populations that reproduce primarily or solely asexually violate the assumptions of HW equilibrium because the decreased frequency (or absence) of recombination causes allele reassortment to become non-random. As a result, linkage disequilibrium may be present in such populations.

4.1.2 Mechanisms of generating genetic diversity

In fungi, sexual reproduction (Chapter 5), horizontal gene transfer (Mehrabi et al. 2011), and random mutation (Agrios 2005) may all produce novel combinations of alleles in each generation. The latter two processes are particularly important among fungi that rarely (or perhaps never) reproduce sexually (Webster and Weber 2007). Horizontal gene transfer occurs when the somatic hyphae of separate strains fuse and exchange nuclei (Saupe 2000). The DNA from these nuclei may undergo recombination, thus producing offspring with novel genetic combinations in the absence of meiosis or a sexual cycle (Webster and Weber 2007). The ability of two strains to participate in horizontal gene transfer depends on their vegetative compatibility group (VCG) identity, which in turn is defined by the alleles present at several loci in the genome (Saupe 2000). Regardless of whether they are compatible, when two hyphae meet, anastomosis (cell fusion) occurs. In the case of compatible hyphae (belonging to the same VCG), nuclei may be exchanged, but in incompatible pairings, the fused cells die and no exchange occurs (Saupe 2000). Compatible pairings (resulting in nuclear exchange) have been reported to occur even across species (Friesen et al. 2006), and this form of genetic exchange has been proposed as the mechanism responsible for the transfer of a virulence gene from Stagonospora nodorum to Pyrenophora tritici-repentis, which subsequently emerged as a serious pathogen of wheat (Friesen et al. 2006).

The movement of transposable elements (or transposons) is another mechanism that may generate novel combinations of sequence elements without sexual recombination. Transposable elements are fragments of DNA that can change their position within a genome (Hartl and Clark 2007). Transposons are widely present in the genomes of both prokaryotes (Hartl 1992) and eukaryotes, including fungi (e.g. Parlange et al. 2011; Torriani et al. 2011), and there is also evidence that they can move between organisms, including between distantly related species (Renner and Bellot 2012; Schaack et al. 2010). In this manner, transposons may transfer genetic information both within and between species. In addition, the mechanism of insertion and excision employed by some groups of transposons is imprecise and may result in the transfer of genetic information that was not originally part of the transposon (Yu et al. 2000). Transposition is thus another mechanism through which horizontal gene transfer may occur. The contribution of transposable elements to genetic diversity and pathogenicity is further described in Chapter 3.


4.1.3 Repetitive DNA sequences and genetic diversity

Several different tools are available to assess the level of genetic variation within a population of fungi. Repetitive sequences, which are present in varying numbers and lengths, are one important tool that has been used by researchers (Jarne and Lagoda 1996).
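Repetitive sequences of this kind can be located computationally. The following is a minimal sketch, not the method used in this thesis, that scans a DNA string for tandem repeats using a regular expression. It assumes the working definition of a simple sequence repeat given later in this section (a motif of 1-6 nucleotides repeated at least five times); the function name `find_ssrs`, its parameters, and the example sequence are hypothetical.

```python
import re


def find_ssrs(sequence, min_repeats=5, max_motif_length=6):
    """Return (start, motif, repeat_count) for each tandem repeat in sequence."""
    if min_repeats < 2:
        raise ValueError("min_repeats must be at least 2")
    # A lazily matched motif of 1..max_motif_length bases, followed by at
    # least (min_repeats - 1) further copies of that same motif.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif_length, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        repeat_count = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeat_count))
    return hits


# Hypothetical sequence containing six copies of the dinucleotide GT.
example = "AACC" + "GT" * 6 + "CCATTG"
print(find_ssrs(example))
```

The lazy quantifier makes the expression prefer the shortest motif that satisfies the repeat threshold, so a run of GT is reported as a dinucleotide repeat rather than as a repeat of GTGT. Imperfect or interrupted repeats are not detected by this simple pattern.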

After the double-helix structure of DNA was elucidated, researchers discovered that, under conditions that mimic the physiologic environment, double-stranded DNA most commonly assumes a conformation known as B-DNA, consisting of a right-handed double helix with 10.5 nucleotides per turn of the helix (Wang 1979). However, other conformations, including a left-handed helix with 12 nucleotides per turn (Z-DNA), have also been recognized for over 30 years (Rich et al. 1984). Researchers observed that Z-DNA was associated with repetitive sequences consisting of alternating pyrimidines and purines (or vice-versa) (Arnott et al. 1980), and short spans of the repeated pattern G-T were detected in Homo sapiens (Hamada and Kakunaga 1982), Mus musculus (mice) (Nishioka and Leder 1980), Drosophila sp. (Nordheim et al. 1981), and Bos taurus (calf) and Oncorhynchus sp. (salmon) (Hamada and Kakunaga 1982). These simple sequence repeats (SSRs), also commonly known as microsatellites or short tandem repeats (Tautz 1993), are widely found in genomic DNA, including in both coding and non-coding regions (Katti et al. 2001). These regions consist of a simple pattern of 1-6 nucleotides, repeated from five to approximately one hundred times at each locus, with several thousand such loci found within a single genome (Tautz 1993). The length of these regions is most likely modulated by errors in DNA replication, including both the "slippage" of DNA polymerases (Levinson and Gutman 1987) and errors in the normal error correction mechanisms (Strand et al. 1993). The large number of such loci in eukaryotic genomes, in addition to the variable length of each individual repeat of a given sequence, makes SSRs an ideal tool for studying genetic diversity. The amplification of a particular SSR from a single individual would be expected to produce multiple bands (corresponding to the multiple loci), each of differing length (reflecting the variable number of repeats at each locus) (Hartl and Clark 2007).
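
The definition of an SSR given above (a motif of 1-6 nucleotides repeated at least five times in tandem) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function name find_ssrs and the test sequence are hypothetical, the thresholds are taken from the description above, and the approach is a plain regular-expression scan rather than the method used by any particular SSR-detection program.

```python
import re

def find_ssrs(sequence, min_repeats=5, max_motif_len=6):
    """Return (start, motif, repeat_count) for each perfect tandem repeat.

    Hypothetical helper: thresholds follow the 1-6 nt motif and
    five-or-more repeats described in the text.
    """
    sequence = sequence.upper()
    # The lazy group captures the shortest motif first; the backreference
    # then demands at least (min_repeats - 1) further tandem copies.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif_len, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence):
        motif = match.group(1)
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits

# A made-up sequence containing six tandem copies of the motif GT.
print(find_ssrs("TTAC" + "GT" * 6 + "CCA"))
```

Only perfect repeats are reported by this sketch; interrupted or compound repeats, which occur in real genomes, would need a more permissive search.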

Inter-Simple Sequence Repeats (ISSRs) are the regions located between SSRs (Figure 4.1). Whereas the study of SSRs requires prior knowledge of the genetic sequence of the regions flanking the repetitive sequences (Jarne and Lagoda 1996), the primers to amplify ISSRs are anchored in the more widely-conserved repetitive sequences (McGregor et al. 2000). Because the number of SSRs in the genome is highly variable, the number and the length of the sequences between them are also variable (McGregor et al. 2000). Together, these factors make ISSR analysis useful for population studies even among groups where prior genomic information is unavailable (McGregor et al. 2000).

Analyses using both SSRs and ISSRs have been extensively used to assess genetic diversity in plants (e.g. Fernández et al. 2002; Galván et al. 2003; Nybom 2004; Patzak 2001; Reddy et al. 2002) and fungi (e.g. Karaoglu et al. 2004; Mishra et al. 2003; Wang et al. 2005). Isolates of the plant pathogenic Fusarium culmorum were assessed using ISSR (Mishra et al. 2003). Similar to M. nivale, this pathogen had been classified as asexual, but was known to possess a high level of phenotypic variation. Investigation of ISSR banding patterns for isolates collected from Europe and North America led these researchers to propose that the geographic populations of this fungus are distinct, and that the variability observed is consistent with recombination (Mishra et al. 2003). Similarly, ISSR analysis revealed genetic variation between isolates of Beauveria bassiana and three other Beauveria spp. collected from different geographic regions, but not from different host plant species (Wang et al. 2005).

When ISSRs were used to investigate differences among isolates of Fusarium poae collected in Argentina and England, the researchers identified a high level of genetic diversity even among isolates collected from the same batch of seeds (Dinolfo et al. 2010). In addition, the isolates did not resolve into two groups based on their country of origin; instead, sub-groups consisting of isolates from a single geographic origin were scattered throughout the dendrogram constructed to visualize the results. Even among isolates of the asexual bean pathogen Pseudocercospora griseola collected from a single population, ISSR analysis identified a unique haplotype for each of the isolates tested (Abadio et al. 2012).

The analysis of ISSR has been used to assess the contribution of recombination to genetic diversity within a small population of phytopathogenic fungi with a known sexual cycle. In a recent investigation of Rhizoctonia solani isolates collected from a single field, Zheng and colleagues suggested that this population likely spreads asexually based on the calculated index of association (2013). The use of markers related to microsatellites provides an economical tool that allows researchers to investigate the genetic diversity among fungal populations.

4.1.4 Genetic diversity in M. nivale

Outside of the genetic, morphological, and pathogenic differences reported between M. nivale and M. majus (Chapters 2 and 3), several studies have demonstrated variation within M. nivale as well. A study comparing isolates of M. majus and M. nivale originally collected on European wheat found that, relative to M. majus, the RAPD profiles of M. nivale were highly diverse (Lees et al. 1995). Maurin and colleagues examined the ITS RFLP patterns among isolates of M. nivale sensu lato collected from several host species and from across Europe (1995). This comparison revealed two distinct groups that corresponded to M. nivale and M. majus. Protein electrophoresis revealed diverse esterase profiles in all twenty of the M. nivale isolates examined in this study; by comparison, all seven of the M. majus isolates examined possessed the same esterase profile (Maurin et al. 1995). Similarly, in a study comparing the EF-1α sequences of M. nivale to those of M. majus, researchers noted a slightly higher level of heterogeneity among the M. nivale sequences than among those of M. majus (Glynn et al. 2005). The wider host range of M. nivale relative to M. majus (Simpson et al. 2000) may provide a possible explanation for the higher variability reported for this species. A 1998 study (Mahuku et al. 1998) investigated isolates of M. nivale collected from four sites representing three different turfgrass hosts. RAPD and RFLP analyses revealed a high level of genetic variation. The two populations collected from the same host species displayed higher similarity to each other than to either of the other populations despite being collected about 20 km apart.

To explain the high level of genetic variation in M. nivale relative to the closely-related M. majus (Chapters 2, 3), some researchers have suggested that M. nivale may undergo sexual reproduction more frequently than M. majus (Mahuku et al. 1998). However, formation of perithecia has been observed more frequently in M. majus than in M. nivale (Lees et al. 1995).

Although the studies described above have reported genetic variation of isolates from several different locations, none have reported re-sampling from the same location across multiple years to determine how rapidly, if at all, genetic variation may fluctuate within a single population. A high turnover in genotype frequency would suggest a sexually recombining population, whereas a high overall level of diversity that is relatively unchanged from year to year could instead suggest an established population of variable individuals that reproduce predominantly or exclusively asexually.

A detailed investigation of the factors that may influence the presence of Hardy-Weinberg equilibrium within Microdochium nivale and M. majus has not been conducted. However, selective pressure imposed by fungicide use against M. nivale on turfgrasses and the apparent host specificity exhibited by both species (Chapter 3) may play an important role in their genetic diversity.

4.1.5 Objectives

The objective of this project was to assess genetic variation in local populations of M. nivale separated temporally and spatially, with the intention of using these results to assess whether sexual reproduction may be occurring within the populations studied.

4.2 Materials and Methods

4.2.1 Sample collection and DNA extraction

Immediately following snowmelt in 2011, 2012, and 2013, samples of grass displaying symptoms of pink snow mold and/or Fusarium patch (Figure 4.2) were collected from various grasses at the Guelph Turfgrass Institute (GTI). The sample collection locations are indicated in Figure 4.3, and a list of the total number of samples obtained at each location is found in Table 4.1. The grass samples collected were processed as described in Section 2.2.1 to obtain single isolates per sample. Genomic DNA was extracted from all of the isolates using the modified Edwards protocol (Edwards et al. 1991) as described in Section 2.2.2, and the quality and quantity of the DNA obtained were assessed by electrophoresis as described in Section 2.2.2. The DNA was diluted as necessary with dH2O to achieve a concentration of 0.1 ng/µL prior to use in PCR.

4.2.2 SSR primer design

Primers for SSR were designed using the default settings of the program QDD (Meglécz et al. 2010) and using the genome of M. nivale isolate 11037 (Chapter 3). Twelve primer sets with penalty values below 7 and a difference in Tm of less than 0.5 °C were selected randomly for synthesis and testing. The SSR primers were synthesized by BioBasic (Markham, Ontario), and all of the primer sets ordered were screened using a PCR protocol identical to that described for the β-tubulin PCR reactions (Chapter 2), with the exception that the annealing temperature was first tested at 50 °C and was raised as necessary to reduce streaking. Primers used in the study are listed in Table 4.2.
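
The selection criteria described above (penalty below 7, Tm difference below 0.5 °C, then a random draw of twelve sets) can be sketched in Python as follows. This is a minimal illustration and not the procedure actually used: the function name and the dictionary keys (penalty, tm_forward, tm_reverse) are hypothetical and do not correspond to the column names of any QDD output file.

```python
import random

def select_primer_sets(candidates, max_penalty=7.0, max_tm_diff=0.5,
                       n=12, seed=None):
    """Filter candidate primer sets, then draw up to n of them at random.

    `candidates` is a list of dicts with the hypothetical keys
    'penalty', 'tm_forward', and 'tm_reverse'.
    """
    eligible = [
        c for c in candidates
        if c["penalty"] < max_penalty
        and abs(c["tm_forward"] - c["tm_reverse"]) < max_tm_diff
    ]
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(eligible, min(n, len(eligible)))

# Made-up candidates: only the first passes both filters.
example = [
    {"id": "set1", "penalty": 3.2, "tm_forward": 60.0, "tm_reverse": 60.3},
    {"id": "set2", "penalty": 8.1, "tm_forward": 59.0, "tm_reverse": 59.1},
    {"id": "set3", "penalty": 2.0, "tm_forward": 58.0, "tm_reverse": 59.0},
]
print([c["id"] for c in select_primer_sets(example, seed=0)])
```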

4.2.3 ISSR and SSR PCR protocols

All ISSR and SSR PCRs were conducted using the same BioRad MyCycler thermocycler (BioRad, Hercules, California). The ISSR primers were obtained either from the local lab collection of ISSR primers or from the literature. All ISSR primers were synthesized by Laboratory Services, University of Guelph (Guelph, Ontario). A list of the primers tested is found in Table 4.2. Each ISSR PCR reaction contained approximately 0.1 ng of fungal DNA (0.1 ng/μL) with a final concentration of 1.2X PCR buffer, 0.22 mM Mg2+ (BioBasic, Markham, ON), 1.5 mM dNTP mixture (prepared using individual solutions of dATP, dTTP, dCTP, and dGTP; BioBasic, Markham, ON), 0.6 μM each of the forward and reverse primers, 0.04 U of Tsg DNA polymerase (BioBasic, Markham, Ontario), and enough sterile water to bring the total volume of the master mix to 20 μL per reaction.

The ISSR primers were screened by testing the DNA from ten samples collected in 2011 (five each from the "roadway" and the "pathology green fringe" locations). The reaction mixture was submitted to a thermal cycling procedure consisting of an initial denaturation at 94 °C for 1.5 minutes followed by 35 repetitions of a denaturation period at 94 °C for 40 seconds, an annealing period of 45 seconds at 45 °C, and an extension period of 1.5 minutes at 72 °C. The mixture was then subjected to a final extension period of 5 minutes at 72 °C before being held at 15 °C until the sample was retrieved and stored in the refrigerator at 4 °C.
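
As a rough check on instrument time, the programmed hold times in the cycling procedure above can be summed. The sketch below is illustrative only: the function name is hypothetical, and ramping between temperatures and the final 15 °C hold are ignored, so an actual run would take longer.

```python
def programmed_seconds(initial_s, cycles, cycle_steps_s, final_s):
    """Sum the programmed hold times of a PCR run, ignoring ramping."""
    return initial_s + cycles * sum(cycle_steps_s) + final_s

# Hold times from the ISSR protocol described in the text, in seconds.
issr_total = programmed_seconds(
    initial_s=90,                # 94 °C for 1.5 min
    cycles=35,
    cycle_steps_s=(40, 45, 90),  # 94 °C, 45 °C, and 72 °C steps
    final_s=300,                 # 72 °C for 5 min
)
print(f"{issr_total} s = {issr_total / 60:.1f} min")
```

Under these assumptions the programmed hold time comes to 6515 seconds, or roughly 109 minutes per run.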

Those primers that yielded at least one band that was present in at least one, but not all, of the isolates tested were selected for further screening. The annealing temperatures for these primers were optimized using a gradient PCR. The temperatures included in the gradient were 42.0, 42.7, 44.0, 46.0, 48.6, 50.7, 52.1, and 53.0 °C, and higher annealing temperatures were tested if streaking was observed. The annealing temperatures for the primers selected for use with all isolates are found in Table 4.3.

For SSR, the PCR reaction mixture was prepared as described for the ISSR reactions.

Based on their consistent production of polymorphic bands from a selection of isolates from each collection location from 2011 and 2012, five SSR primers were used to amplify DNA isolated from all of the samples included in this study (Table 4.3).

4.2.4 Result scoring and analysis

To visualize the results of the test SSR and ISSR PCRs, 1.2% agarose gels were prepared using 0.5X TBE buffer. A 3:1 mixture of PCR product and loading buffer (Chapter 2) was loaded onto the gel and the samples were subjected to electrophoresis in 0.5X TBE buffer at 50 V for approximately 60 minutes before being stained with an ethidium bromide solution and visualized under UV light. For the PCRs that included all isolates from a given year, large, 72-well 1.2% agarose gels were instead subjected to electrophoresis at 85 V and 35 mW for 4 h. Commercial DNA ladders (both 100 bp and 1 kb) (described in Chapter 2) were also loaded in each gel prepared in 2011 and 2012 so that the sizes of the bands could be estimated. For the 2013 samples, the ladders used were a 100 bp ladder (BioBasic, Markham, ON), which had bands ranging from 0.1 to 1.5 kb in length, and the Lambda DNA / HindIII plus marker (BioBasic, Markham, ON), which had bands ranging from 0.3 to 23 kb in length.

Each gel was photographed electronically and the presence or absence of bands for all of the samples amplified with the eight ISSR and SSR primers or primer sets was scored in a binary matrix. Scoring was performed by eye. The data for all primer sets were combined and were used to construct a UPGMA tree by using the program WinDist (Yap and Nelson 1996) to calculate a distance matrix, and the "neighbor" function of PHYLIP v. 3.69 (Felsenstein 1993) to translate this matrix into a tree. Bootstrap values were calculated using WinBoot (Yap and Nelson 1996). Trees were visualized and saved as image files using Archaeopteryx (Han and Zmasek 2009), and bootstrap values (calculated by WinBoot) were added to the figure in PowerPoint.
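
The general workflow described above (binary band matrix, pairwise distances, UPGMA clustering) can be illustrated with a small self-contained Python sketch. It is not a reimplementation of WinDist or PHYLIP: the Jaccard distance is an assumed stand-in because the coefficient used is not stated here, and the function names and isolate labels are hypothetical.

```python
from itertools import combinations

def band_distance(a, b):
    """Jaccard distance between two 0/1 band profiles (assumed coefficient)."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - shared / either

def upgma(profiles):
    """Average-linkage clustering; returns the tree as nested tuples."""
    sizes = {name: 1 for name in profiles}  # cluster -> number of isolates
    dist = {frozenset(pair): band_distance(profiles[pair[0]], profiles[pair[1]])
            for pair in combinations(profiles, 2)}
    while len(sizes) > 1:
        closest = min(dist, key=dist.get)
        a, b = sorted(closest, key=repr)  # fixed order keeps output repeatable
        merged, merged_size = (a, b), sizes[a] + sizes[b]
        for other in sizes:
            if other in (a, b):
                continue
            # Size-weighted mean of the two child-to-other distances.
            dist[frozenset((merged, other))] = (
                sizes[a] * dist[frozenset((a, other))]
                + sizes[b] * dist[frozenset((b, other))]
            ) / merged_size
        # Drop every distance that involves one of the merged clusters.
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        del sizes[a], sizes[b]
        sizes[merged] = merged_size
    return next(iter(sizes))

# Made-up band scores (1 = band present) for four hypothetical isolates.
scores = {"A": (1, 1, 0, 0), "B": (1, 1, 0, 1),
          "C": (0, 0, 1, 1), "D": (0, 1, 1, 1)}
print(upgma(scores))
```

Branch lengths and bootstrap resampling of loci are omitted to keep the sketch short; only the tree topology is returned.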

Tests for linkage disequilibrium were conducted using two different methods. An exact test for linkage disequilibrium for haplotypic data was conducted using Arlequin v. 3.1 (Excoffier et al. 2005). The second test was conducted using the Index of Association (IA) (Smith et al. 1993) to compare the observed variance in the number of mismatches between isolates (Vobs) with the expected variance (Vexp) of mismatches based on allelic frequencies for a randomly recombining system without linkage disequilibrium. The IA is calculated as:

IA = (Vobs / Vexp) - 1 (equation 4.1)

If IA is larger or smaller than 0, linkage disequilibrium is present in the system. For this study, if the absolute value of IA was greater than 2, linkage disequilibrium was scored as present. To assess the variation in IA, 100 bootstrap replicates were conducted to generate data sets with the same number of isolates and the same allele frequencies for each locus but with random assignments of alleles to isolates without the assumption of linkage (Feil et al. 1996). This test (the calculation of IA and bootstrap testing) was implemented using the program DISEQUIL (Mahuku et al. 1998; program available at: www.uoguelph.ca/~thsiang/pubs/supplement/disequilibrium/disequil.zip). Using both methods, linkage disequilibrium was assessed separately within each "by-year" group, "location" group, and "year and location" group. An example of the input file for Arlequin is found in Appendix 4.1. For fully clonal isolates (i.e. groups of isolates with identical banding patterns across all eight primers), only one representative was included in the input for these calculations.
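
The calculation in equation 4.1 and the randomization step described above can be sketched in Python as follows. This is not the DISEQUIL program: the function names are hypothetical, and the expected variance is estimated here as the sum over loci of h(1 - h), where h is the proportion of isolate pairs that differ at a locus, which is one common estimator and an assumption of this sketch. At least one locus must be polymorphic, otherwise Vexp is zero.

```python
import random
from itertools import combinations

def index_of_association(matrix):
    """IA = Vobs / Vexp - 1 for a list of 0/1 band profiles (one per isolate)."""
    pairs = list(combinations(matrix, 2))
    # Number of loci at which each pair of isolates differs.
    mismatches = [sum(x != y for x, y in zip(a, b)) for a, b in pairs]
    mean_k = sum(mismatches) / len(pairs)
    v_obs = sum((k - mean_k) ** 2 for k in mismatches) / len(pairs)
    v_exp = 0.0
    for j in range(len(matrix[0])):
        h = sum(a[j] != b[j] for a, b in pairs) / len(pairs)
        v_exp += h * (1 - h)  # assumed estimator of the per-locus variance
    return v_obs / v_exp - 1

def randomized_ia(matrix, replicates=100, seed=None):
    """IA values after shuffling each locus independently among isolates."""
    rng = random.Random(seed)
    columns = [list(col) for col in zip(*matrix)]
    values = []
    for _ in range(replicates):
        # Shuffling within a column keeps that locus's allele frequencies.
        shuffled = [rng.sample(col, len(col)) for col in columns]
        values.append(index_of_association([list(r) for r in zip(*shuffled)]))
    return values

# Made-up data: two loci in complete association across four isolates.
linked = [[1, 1], [1, 1], [0, 0], [0, 0]]
print(index_of_association(linked))
```

Comparing the observed value with the distribution returned by randomized_ia gives a sense of how extreme the observed association is relative to random assortment of alleles.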

4.3 Results

4.3.1 ISSR primer screening

A total of 12 ISSR primers previously used in the lab and from the literature were obtained or synthesized and were tested with five isolates from each of the two locations from the 2011 collection year. Eleven of these primers produced at least a single band with each isolate, and the PCR conditions were further optimized by performing a gradient PCR. When the primers were tested with the samples from 2012 and 2013, only three of the eleven primers produced polymorphic bands. For this reason, only the primers BHY(AGC)5, (GA)8T, and DD(CCA)5 were used to amplify all of the DNA samples. These primers yielded between zero and eight bands per isolate (Appendix 4.2). One of the primers ((CT)5RG) did not amplify any of the isolates tested.

The primer BHY(AGC)5 produced 12 polymorphic bands, ranging in size from 0.4 to 2.1 kb. The number of bands observed for any single isolate ranged from zero to a maximum of eight, with an average of five bands per isolate. The primer (GA)8T produced 19 polymorphic bands, ranging in size from 0.26 to 2.2 kb. The number of bands observed for any single isolate ranged from zero to six, with an average of three bands per isolate. The primer DD(CCA)5 produced 19 polymorphic bands, ranging in size from 0.4 to 2.2 kb. The number of bands observed for any single isolate ranged from zero to eight, with an average of five bands per isolate.

4.3.2 SSR primer design and screening

Using the program QDD, a total of 41,770 primer sets were designed. Twelve of the primer sets were ordered and were tested with a subset of five isolates from each of the two locations and from both the 2011 and 2012 collection years (i.e. a total of 20 isolates). Five of these primer sets produced at least one polymorphic band and were used to amplify the DNA from all of the isolates. The primers used to amplify all of the isolates were 961061, 910478, (CT)5, (GAT)6, and (GC)5, and these yielded between zero and eight bands per isolate (Appendix 4.2).

Primer pair 961061 produced 16 polymorphic bands, ranging in size from 0.17 to 3.0 kb. The number of bands observed for any single isolate ranged from zero to eight, with an average of four bands per isolate. Primer pair 910478 produced 17 polymorphic bands, ranging in size from 0.07 to 5.0 kb. The number of bands observed for any single isolate ranged from zero to ten, with an average of four bands per isolate. Primer pair (CT)5 produced 23 polymorphic bands, ranging in size from 0.09 to 2.5 kb. The number of bands observed for any single isolate ranged from zero to nine, with an average of four bands per isolate. Primer pair (GAT)6 produced 10 polymorphic bands, ranging in size from 0.2 to 1.3 kb. The number of bands observed for any single isolate ranged from zero to five, with an average of two bands per isolate. Primer pair (GC)5 produced 20 polymorphic bands, ranging in size from 0.075 to 5.0 kb. The number of bands observed for any single isolate ranged from zero to nine, with an average of three bands per isolate.

When the isolates were compared using all of the polymorphic SSR and ISSR bands produced, only two clonal isolates (collected from the pathology green fringe in 2011) were identified. However, within the results for each primer, several isolates shared identical banding patterns. A table summarizing these relationships for each primer is found in Appendix 4.2.
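
Identifying isolates with identical banding patterns, and keeping one representative per pattern for downstream calculations, amounts to grouping isolates by their band profile. The sketch below shows one way this could be done; the function names and isolate labels are hypothetical, and the data are made up.

```python
from collections import defaultdict

def group_identical_profiles(profiles):
    """Return lists of isolate names that share an identical band profile."""
    groups = defaultdict(list)
    for name, bands in profiles.items():
        groups[tuple(bands)].append(name)
    return [names for names in groups.values() if len(names) > 1]

def clone_correct(profiles):
    """Keep only the first isolate seen for each distinct band profile."""
    first_seen = {}
    for name, bands in profiles.items():
        first_seen.setdefault(tuple(bands), name)
    return {name: profiles[name] for name in first_seen.values()}

# Made-up band scores: iso1 and iso3 are identical at every locus.
demo = {"iso1": [1, 0, 1, 1], "iso2": [1, 1, 0, 1], "iso3": [1, 0, 1, 1]}
print(group_identical_profiles(demo))
print(sorted(clone_correct(demo)))
```

Applying the same grouping to the columns scored for a single primer, rather than to the full profile, would list isolates that share a banding pattern for that primer only.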

4.3.3 Linkage disequilibrium calculations

In the linkage disequilibrium calculations performed by DISEQUIL (Table 4.4), linkage disequilibrium was detected among all of the isolate groups collected at each location and in each of the three years, with the exception of the six samples collected near the pathology green in 2011. Linkage disequilibrium was also calculated in Arlequin, using a p value of 0.05. By default, Arlequin assesses only those alleles which are polymorphic within each of the groups defined.

In the 2011 samples, linkage disequilibrium was not found among any of the 26 polymorphic loci from the six pathology green samples, but was detected in 20 of the 37 polymorphic loci from the roadside samples. For the 46 samples from 2012 and the 74 samples from 2013, linkage disequilibrium was detected among the polymorphic loci of all samples (Table 4.4).

4.3.4 Year-to-year and location-to-location comparisons

The bootstrapped UPGMA dendrogram (Figure 4.4) revealed some trends among the isolates grouped by collection year. For the 2013 collection, 47 of the 74 total isolates were grouped together in 50% of the trees produced. An additional 25 isolates formed a separate group in 32% of the trees. The remaining two isolates were grouped together with 100% bootstrap support. Among the 2012 isolates, 45 of the 46 isolates formed a single group with 58% bootstrap support. The remaining 2012 isolate was external to the group containing the other 2012 isolates, but in 29% of the trees produced, it was present in a sister group relative to the larger "2012" group. In 95% of the trees, all of the 2011 isolates were grouped together. In addition to these within-year trends, the 2013 isolates were split into two large groups of unequal size as described above and one much smaller group consisting of just two isolates. The largest group was a sister group to the 2011 group in 39% of the trees examined. The 2013 two-isolate group was a sister group to the 2012 isolates in 14% of the trees examined, and the final group of 2013 isolates was clustered with the larger group that contained both the 2012 isolates and the 2013 isolate pair in 20% of the trees sampled.

As a whole, the isolates collected from the two locations did not group together, but there were some trends for the samples collected at the two locations within each year. Within the 2013 isolates, 28 of the 43 "pathology green" isolates formed a paraphyletic group with 23% bootstrap support; an additional 6 isolates formed a single group with 98% bootstrap support. Among the 31 "roadside" isolates collected in 2013, two separate groups of four isolates formed distinct groups, each with 100% bootstrap support; an additional seven 2013 "roadside" isolates were grouped together with 77% support. The remaining isolates from both locations formed polyphyletic groups.

For the isolates collected in 2012, all but one of the 17 "roadside" samples formed a group with 48% bootstrap support. The 25 "pathology green" 2012 isolates formed a single group with 54% bootstrap support. Among the 2011 isolates, three of the five "pathology green" isolates formed a single group with 22% bootstrap support. There were no strongly-supported (i.e. bootstrap values above 50%) non-polyphyletic groups containing the nine 2011 "roadside" isolates.

4.4 Discussion

In this Chapter, the genetic diversity of two proximate populations of M. nivale was assessed yearly during a three-year period. The goal of these experiments was to determine whether genotypes persisted from year to year and to explore whether recombination may be occurring in these populations. The isolates that were examined in these experiments were collected from two proximate patches of Kentucky bluegrass at the Guelph Turfgrass Institute, immediately following snowmelt in 2011, 2012, and 2013.

At the beginning of the study, the first tools that were used to assess the genetic diversity among these samples were ISSR markers. A set of 12 primers was tested, but only three ultimately yielded polymorphic bands for the isolates under study. One primer failed to amplify any of the isolates tested while the other eight primers did not yield polymorphic bands (but did amplify at least one band from the isolates tested). Among the three primers that did yield polymorphic results, either 12 or 19 polymorphic bands were produced. These bands were scored as polymorphic because they were observed in at least one, but not all, of the isolates tested.

Because only three ISSR primers yielded polymorphic bands, SSR primers were designed using the genome of M. nivale isolate 11037 and the program QDD to further explore the genetic diversity within the populations of interest. The program identified over 46,000 possible SSR primer pairs from the M. nivale genome, which is consistent with the results obtained from the similarly-sized genomes of other Ascomycetes (Fang Shi and Mihaela Stanescu, personal communication). Although other researchers have used QDD to detect SNPs in fungi (e.g. Rouxel et al. 2012), the number of putative SSR primers detected was not reported.

Of the 12 SSR primer sets ordered and tested, five were selected for use with all of the isolates collected based on their reproducible production of polymorphic bands. The remaining seven primers amplified at least one band from the isolates they were tested with, but failed to yield polymorphic results. This result is discussed in more detail below.

Previous research has reported the presence of high levels of genetic diversity among M. nivale, which has led some researchers to suggest that sexual reproduction may be common within this species (e.g. Mahuku et al. 1998). A high level of genetic variation was also detected in this experiment, as a total of 130 polymorphic bands were produced by the eight ISSR and SSR primers used in this study. In each year, isolates of M. nivale were collected from two locations. Although there was no overall pattern in the distribution of genotypes between the two locations, trends were detected both between and among the three years studied.

The isolate collections from two of the three years of this study formed single groups; the 2013 data, which split into two large, paraphyletic groups and one very small group consisting of just two isolates, were the exception to this observation. The two large groups of 2013 isolates were grouped more strongly with the 2011 and the 2012 isolates than with each other, suggesting that there were distinct populations within the isolates from this collection year. This variability in the 2013 population was independent of collection location, as both of the large groups contained isolates from the two locations.

In addition to the differences observed between the years, within each year, the isolates originally collected from alongside the pathology green formed a separate group relative to the isolates collected from the roadside. This consistent pattern suggests that, despite the proximity of these two locations, there was restricted gene flow between these populations. In addition, the grouping of the isolates by year, and then by location (and not by location and then by year), implies that the year-to-year variability was higher than the location-to-location variability. The splitting of the 2013 isolates into two large groups that were clustered with the 2011 and the 2012 isolates, respectively, implies that new genetic information was introduced into the population of isolates between 2012 and 2013; however, it is difficult to extend this claim to the 2011 isolates due to the small sample size. The low level of clonality detected also supports the hypothesis that asexual reproduction alone does not account for the variability observed. The most likely sources for new genetic information in these populations are sexual reproduction (i.e. novel combinations of alleles that were already in existence in the population), the infestation of the locations of interest with inoculum from an outside source (e.g. from neighbouring fields), or some combination of these two effects.

Although the overall level of clonality among all of the isolates studied was very low, within each primer or primer set there were several isolates which produced identical banding patterns (Appendix 4.2). The identical banding patterns were primarily found within groups of isolates from the same year and collection location, and suggest that, although not fully clonal, large numbers of isolates did share "islands" of genetic information. These groups of shared alleles may also account for the linkage disequilibrium detected among most of the populations studied. Linkage disequilibrium was detected among all of the year-by-location groups with the exception of the "pathology green" samples from 2011. This non-random inheritance pattern could be caused by several factors including the physical proximity of the loci chosen for study or a lack of recombination among the population.

The pockets of similarity represented by these "islands" of identical banding patterns may explain why linkage disequilibrium was detected among each of the populations, because the banding patterns that were observed among more than three isolates were confined within a single year, and most often within a single collection location. The persistent grouping of certain alleles responsible for any particular banding pattern may imply that isolates sharing this pattern share a common parent; however, a hypothetical parent was not detected among the previous years' isolates that were included in this study.

The apparent linkage of many loci within the individuals tested also agrees with some of the ISSR and SSR results obtained during the primer testing phase. For both methods, the majority of the primers or primer sets tested (a total of 16 out of 24) failed to yield polymorphic bands. Instead, either identical banding patterns (15 of the 16 "rejected" primers) or no banding at all (one primer) were produced.

Taken together with the results from the primers that did yield polymorphic bands, the results from the non-polymorphic bands suggest that, although there was genetic variation within the populations studied, large portions of the genomes of the isolates sampled were not highly variable. The presence of a high degree of similarity, reflected through both the patterns of similarity within the "polymorphic" primers and through the identical results for all isolates with the non-polymorphic primers, is broadly consistent with a largely asexual population that experiences infrequent sexual reproduction. Despite the high level of genetic variation reported within local populations of M. nivale (e.g. Mahuku et al. 1998; this study), the frequency of sexual reproduction within this species is still uncertain (Chapter 5). Although perithecia have been observed in vitro for this species, they have not been reported from the field (Smith 1983), suggesting that if M. nivale does undergo sexual reproduction, this process may be infrequent or may occur only within a narrow set of conditions (Smith 1983; Tronsmo et al. 2001). Even fungi that are not known to possess a sexual cycle can exhibit a high level of genotypic variation by ISSR (Abadio et al. 2012); regardless, the "pockets" of genotypic similarity observed within each year in this study are similar to the overall trend observed between countries for populations of the homothallic pathogen F. poae studied by Dinolfo and colleagues (2010).

To further explore the trends observed within these experiments, more thorough sampling of both the two locations included in this study and the surrounding areas is necessary. A more detailed comparison between these isolates and those collected from the surrounding fields, which consist of a different turfgrass species, may elucidate possible sources for new genetic information. In addition, the presence of horizontal gene transfer in M. nivale was not examined, and may prove to be an important mechanism of gene transfer within this population.

The results presented in this chapter show that the genotypes present within two small populations of M. nivale were variable when sampled yearly across a period of three years using SSR and ISSR markers. However, this variability may have been limited to only small portions of the genome, because linkage disequilibrium calculations and examinations of the banding patterns within primers or primer sets suggested that, despite possessing some differences, isolates that were ultimately ranked as possessing a unique genotype actually contained several loci that were identical to those of other individuals within their year and/or location collection group. These results underscore the necessity of examining a large number of isolates for a study of this kind, and suggest that a low level of sexual reproduction may be occurring in M. nivale in the field. The ability of M. nivale to undergo sexual reproduction is further explored in Chapter 5, which also includes an investigation into the mating type sequences in both M. nivale and M. majus.


4.5 References for Chapter 4

Abadio, A.K.R., Lima, S.S., Santana, M., Salomao, T.M.F., Sartorato, A., Mitzubuti, E.S.G., Araujo, E.F., and de Queiroz, M.V. 2012. Genetic diversity analysis of isolates of the fungal bean pathogen Pseudocercospora griseola from central and southern Brazil. Genetics and Molecular Research 11(2): 1272-1279.
Agrios, G.N. 2005. Plant Pathology. Elsevier Academic Press, Burlington, MA.
Arnott, S., Chandrasekaran, R., Birdsall, D.L., Leslie, A.G.W., and Ratliff, R.L. 1980. Left-handed DNA helices. Nature 283: 743-745.
Bayraktar, H., Dolar, F.S., and Maden, S. 2008. Use of RAPD and ISSR markers in detection of genetic variation and population structure among Fusarium oxysporum f. sp. ciceris isolates on chickpea in Turkey. Journal of Phytopathology 156: 146-154.
Baysal, O., Siragusa, M., Gumrukcu, E., Zengin, S., Carimi, F., Sajeva, M., and Teixeira da Silva, J.A. 2010. Molecular characterization of Fusarium oxysporum f. melongenae by ISSR and RAPD markers on eggplant. Biochemical Genetics 48: 524-537.
Bornet, B., and Branchard, M. 2001. Nonanchored Inter Simple Sequence Repeat (ISSR) markers: reproducible and specific tools for genome fingerprinting. Plant Molecular Biology Reporter 19: 209-215.
Chasnov, J.R. 2000. Mutation-selection balance, dominance and the maintenance of sex. Genetics 156(3): 1419-1425.
Dinolfo, M.I., Stenglein, S.A., Moreno, M.V., Nicholson, P., Jennings, P., and Salerno, G.L. 2010. ISSR markers detect high genetic variation among Fusarium poae isolates from Argentina and England. European Journal of Plant Pathology 127: 483-491.
Edwards, K., Johnstone, C., and Thompson, C. 1991. A simple and rapid method for the preparation of plant genomic DNA for PCR analysis. Nucleic Acids Research 19: 1349.
Excoffier, L., Laval, G., and Schneider, S. 2005. Arlequin ver. 3.0: An integrated software package for population genetics data analysis. Evolutionary Bioinformatics Online 1: 47-50.
Feil, E., Zhou, J., Smith, J.M., and Spratt, B.G. 1996. A comparison of the nucleotide sequences of the adk and recA genes of pathogenic and commensal Neisseria species: evidence for extensive interspecies recombination within adk. Journal of Molecular Evolution 43: 631-640.
Felsenstein, J. 1993. PHYLIP (Phylogeny Inference Package) version 3.69. Department of Genetics, University of Washington, Seattle.
Fernández, M.E., Figueiras, A.M., and Benito, C. 2002. The use of ISSR and RAPD markers for detecting DNA polymorphism, genotype identification and genetic diversity among barley cultivars with known origin. Theoretical and Applied Genetics 104: 845-851.
Friesen, T.L., Stukenbrock, E.H., Liu, Z., Meinhardt, S., Ling, H., Faris, J.D., Rasmussen, J.B., Solomon, P.S., McDonald, B.A., and Oliver, R.P. 2006. Emergence of a new disease as a result of interspecific virulence gene transfer. Nature 38(8): 1839-1842.
Galván, M.Z., Bornet, B., Palatti, P.A., and Branchard, M. 2003. Inter simple sequence repeat (ISSR) markers as a tool for the assessment of both genetic diversity and gene pool origin in common bean (Phaseolus vulgaris L.). Euphytica 132: 297-301.


Glynn, N.C., Hare, M.C., Parry, D.W., and Edwards, S.G. 2005. Phylogenetic analysis of EF-1 alpha gene sequences from isolates of Microdochium nivale leads to elevation of varieties majus and nivale to species status. Mycological Research 109: 872-880.
Hamada, H., and Kakunaga, T. 1982. Potential Z-DNA forming sequences are highly dispersed in the human genome. Nature 298: 396-398.
Han, M.V., and Zmasek, C.M. 2009. phyloXML: XML for evolutionary biology and comparative genomics. BMC Genomics 10: 356-361.
Hantula, J., Dusabenygasani, M., and Hamelin, R.C. 1996. Random amplified microsatellites (RAMS) - a novel method for characterizing genetic variation within fungi. European Journal of Forest Pathology 26: 159-166.
Hartl, D.L. 1992. Population genetics of microbial organisms. Current Opinion in Genetics and Development 2: 937-942.
Hartl, D.L., and Clark, A.G. 2007. Principles of Population Genetics, Fourth Edition. Sinauer Associates, Sunderland, Massachusetts.
Jarne, P., and Lagoda, P.J.L. 1996. Microsatellites, from molecules to population and back. Trends in Ecology & Evolution 11: 424-429.
Karaoglu, H., Man Ying Lee, C., and Meyer, W. 2004. Survey of simple sequence repeats in completed fungal genomes. Molecular Biology and Evolution 22(3): 639-649.
Katti, M.V., Ranjekar, P.K., and Gupta, V.S. 2001. Differential distribution of simple sequence repeats in eukaryotic genome sequences. Molecular Biology and Evolution 18(7): 1161-1167.
Lees, A.K., Nicholson, P., Rezanoor, H.N., and Parry, D.W. 1995. Analysis of variation within Microdochium nivale from wheat - evidence for a distinct subgroup. Mycological Research 99: 103-109.
Levinson, G., and Gutman, G.A. 1987. Slipped-strand mis-pairing: a major mechanism for DNA sequence evolution. Molecular Biology and Evolution 4: 203-221.
Mahuku, G.S., Hsiang, T., and Yang, L. 1998. Genetic diversity of Microdochium nivale isolates from turfgrass. Mycological Research 102: 559-567.
Maurin, N., Rezanoor, H.N., Lamkadmi, Z., Some, A., and Nicholson, P. 1995. A comparison of biological, molecular, and enzymatic markers to investigate variability within Microdochium nivale (Fries) Samuels and Hallett. Agronomie 15(1): 39-47.
McGregor, C.E., Lambert, C.A., Greyling, M.M., Louw, J.H., and Warnich, L. 2000. A comparative assessment of DNA fingerprinting techniques (RAPD, ISSR, AFLP and SSR) in tetraploid potato (Solanum tuberosum L.) germplasm. Euphytica 113: 135-144.
Meglécz, E., Costedoat, C., Dubut, V., Gilles, A., Malausa, T., Pech, N., and Martin, J-F. 2010. QDD: a user-friendly program to select microsatellite markers and design primers from large sequencing projects. Bioinformatics 26(3): 403-404.
Mehrabi, R., Bhahkali, A.H., Abd-Elsalam, K.A., Moslem, M., M'Barek, S.B., Gohari, A.M., Jashni, M.K., Stergiopoulos, I., Kema, G.H.J., and de Wit, P.J.G.M. 2011. Horizontal gene and chromosome transfer in plant pathogenic fungi affecting host range. FEMS Microbiology Reviews 35(3): 542-554.
Mishra, P.K., Fox, R.T.V., and Culham, A. 2003. Inter-simple sequence repeat and aggressiveness analyses revealed high genetic diversity, recombination and long-range dispersal in Fusarium culmorum. Annals of Applied Biology 143: 291-301.


Nishioka, Y., and Leder, P. 1980. Organization and complete sequence of identical embryonic and plasmacytoma kappa V-region genes. The Journal of Biological Chemistry 255: 3691-3694.
Nordheim, A., Pardue, M.L., Lafer, E.M., Möller, A., Stollar, B.D., and Rich, A. 1981. Antibodies to left-handed Z-DNA bind to interband regions of Drosophila polytene chromosomes. Nature 294: 417-422.
Nybom, H. 2004. Comparison of different nuclear DNA markers for estimating intraspecific genetic diversity in plants. Molecular Ecology 13(5): 1143-1155.
Parlange, F., Oberhaensli, S., Breen, J., Platzer, M., Taudien, S., Simkova, H., Wicker, T., Dolezel, J., and Keller, B. 2011. A major invasion of transposable elements accounts for the large size of the Blumeria graminis f.sp. tritici genome. Functional and Integrative Genomics 11: 671-677.
Patzak, J. 2001. Comparison of RAPD, STS, ISSR and AFLP molecular methods used for assessment of genetic diversity in hop (Humulus lupulus L.). Euphytica 121: 9-18.
Reddy, M.P., Sarla, N., and Siddiq, E.A. 2002. Inter simple sequence repeat (ISSR) polymorphism and its application in plant breeding. Euphytica 128: 9-17.
Renner, S.S., and Bellot, S. 2012. Horizontal gene transfer in Eukaryotes: fungi-to-plant and plant-to-plant transfers of organellar DNA. In Genomics of Chloroplasts and Mitochondria. Edited by R. Bock. Springer, London.
Rich, A., Nordheim, A., and Wang, A.H.J. 1984. The chemistry and biology of left-handed Z-DNA. Annual Review of Biochemistry 53: 791-846.
Rouxel, M., Papura, D., Nogueira, M., Mchefer, V., Dezette, D., Richard-Cervera, S., Carrere, S., Mestre, P., and Delmotte, F. 2012. Microsatellite markers for characterization of native and introduced populations of Plasmopara viticola, the causal agent of grapevine downy mildew. Applied and Environmental Microbiology 78(17): 6337-6340.
Saupe, S.J. 2000. Molecular genetics of heterokaryon incompatibility in filamentous ascomycetes. Microbiology and Molecular Biology Reviews 64(3): 489-502.
Schaack, S., Gilbert, C., and Feschotte, C. 2010. Promiscuous DNA: horizontal transfer of transposable elements and why it matters for eukaryotic evolution. Trends in Ecology & Evolution 25(9): 537-546.
Schneider, M., Grunig, C.R., Holdenrieder, O., and Sieber, T. 2009. Cryptic speciation and community structure of Herpotrichia juniperi, the causal agent of brown felt blight of conifers. Mycological Research 113(8): 887-896.
Simpson, D.R., Rezanoor, H.N., Parry, D.W., and Nicholson, P. 2000. Evidence for differential host preference in Microdochium nivale var. majus and Microdochium nivale var. nivale. Plant Pathology 49(2): 261-268.
Smith, J.D. 1983. Fusarium nivale (Gerlachia nivalis) from cereals and grasses - is it the same fungus? Canadian Plant Disease Survey 63(1): 25-26.
Smith, J.M., Smith, N.H., O'Rourke, M., and Spratt, B.G. 1993. How clonal are bacteria? Proceedings of the National Academy of Sciences of the U.S.A. 90: 4384-4388.
Strand, M., Prolla, T.A., Liskay, R.M., and Petes, T.D. 1993. Destabilization of tracts of simple repetitive DNA in yeast by mutations affecting DNA mismatch repair. Nature 365: 274-276.


Tautz, D. 1993. Notes on the definition and nomenclature of tandemly repetitive DNA sequences. In DNA Fingerprinting: state of science. Edited by S.D.J. Pena, R. Chakraborty, J.T. Epplen, and A.J. Jeffreys. Springer, Basel.
Torriani, S.F.F., Stukenbrock, E.H., Brunner, P.C., McDonald, B.A., and Croll, D. 2011. Evidence for extensive recent intron transposition in closely related fungi. Current Biology 21: 2017-2022.
Tronsmo, A.M., Hsiang, T., Okuyama, H., and Nakajima, T. 2001. Low temperature diseases caused by Microdochium nivale. In Low temperature plant microbe interactions under snow. Edited by D.A. Gaudet, Tronsmo, A.M., Matsumoto, N., Yoshida, M., and Nishimune, A. Hokkaido National Agricultural Experiment Station, Japan.
Wang, J.C. 1979. Helical repeat of DNA in solution. Proceedings of the National Academy of Sciences of the U.S.A. 76(1): 200-203.
Wang, S., Miao, X., Zhao, W., Huang, B., Fan, M., Li, Z., and Huang, Y. 2005. Genetic diversity and population structure among strains of the entomopathogenic fungus, Beauveria bassiana, as revealed by inter-simple sequence repeats (ISSR). Mycological Research 109(12): 1364-1372.
Webster, J., and Weber, R.W.S. 2007. Introduction to Fungi, 3rd edition. Cambridge University Press, New York.
Yap, I.V., and Nelson, R.J. 1996. Winboot: A program for performing bootstrap analysis of binary data to determine the confidence limits of UPGMA-based dendograms. In IRRI Discussion Paper Series. International Rice Research Institute, Manila, Philippines.
Yu, Z., Wright, S.I., and Bureau, T.E. 2000. Mutator-like elements in Arabidopsis thaliana: structure, diversity, and evolution. Genetics 156: 2019-2031.
Zhan, J., Pettway, R.E., and McDonald, B.A. 2003. The global genetic structure of the wheat pathogen Mycosphaerella graminicola is characterized by high nuclear diversity, low mitochondrial diversity, regular recombination, and gene flow. Fungal Genetics and Biology 38: 186-297.
Zheng, L., Shi, F., and Hsiang, T. 2013. Genetic structure of a population of Rhizoctonia solani AG 2-2 IIIB from Agrostis stolonifera revealed by inter-simple sequence repeat (ISSR) markers. Canadian Journal of Plant Pathology.


Table 4.1 Year and location of collection from the Guelph Turfgrass Institute for all samples included in multi-year screening. See Figure 4.3 for a map depicting these locations.

        Collection location and grass species
Year    Near Pathology Green (PG)    Near Roadside (Rd)
2011    6                            11
2012    25                           21
2013    43                           31


Table 4.2 List of all SSR and ISSR primers screened to assess genetic variation in Microdochium nivale field isolates collected across three years.

Primer Type  Primer Name       Primer Sequence(s)                         Primer Source
ISSR         (ACC)6CC          ACCACCACCACCACCACCCC                       (Baysal et al. 2010)
ISSR         (AG)8             AGAGAGAGAGAGAGAG                           (Bayraktar et al. 2008)
ISSR         (CT)5RG           CTCTCTCTCTRG                               Hsiang Lab
ISSR         CT(GA)8           CTGAGAGAGAGAGAGAGA                         (Dinolfo et al. 2010)
ISSR         (CCA)5            CCACCACCACCACCA                            (Bornet and Branchard 2001)
ISSR         (GA)8T            GAGAGAGAGAGAGAGAT                          Hsiang Lab
ISSR         (GA)6GG           GAGAGAGAGAGAGG                             Hsiang Lab
ISSR         BHY(AGC)5         BHYAGCAGCAGCAGCAGC                         (Schneider et al. 2009)
ISSR         CCA(TGA)5TG       CCATGATGATGATGATGATG                       (Baysal et al. 2010)
ISSR         (ACC)6CC          ACCACCACCACCACCACCCC                       (Baysal et al. 2010)
ISSR         DD(CCA)5          DDCCACCACCACCACCA                          (Hantula et al. 1996)
ISSR         (CAC)5            CACCACCACCACCAC                            (Dinolfo et al. 2010)
SSR          MnSSR_(GC)5 F/R   TGCAGGGACTCATCGACC / TCATCTCCGCCACACTCC    This study
SSR          MnSSR_(CT)5 F/R   GCCGCGAGGTCACTACAG / CCCGGGAAGAGGAAGTTG    This study
SSR          MnSSR_(AAC)5 F/R  CCAACCTCGAGGCAGACA / CATCGCTTCCGTTGCTGT    This study
SSR          MnSSR_(CT)7 F/R   CAACATTGACGCCTATCGC / CGTCGTGGACCTCCTTTG   This study
SSR          MnSSR_(GAT)6 F/R  ACAAGACCGATGACGATGAC / GCGAGCGAGGTGTACAAA  This study
SSR          MnSSR_(CG)5 F/R   GTGGAACTTGAGCCGCAC / GGCTCGATCCCGAAGC      This study
SSR          MnSSR_916061 F/R  CACCAAGCAAAGCGAGGA / CTGACTCGAGCCCGCATA    This study
SSR          MnSSR_908729 F/R  GTTGGAGACTGACGGCGA / TGGTTGTCGGAGCCTGAG    This study
SSR          MnSSR_920718 F/R  TCCATCACGGAACGGG / GCCCTTTCACTTCACAGCA     This study
SSR          MnSSR_923044 F/R  GCGGTGGTTCCGTAAGTG / GGTAAAGCGCTTGGCAGA    This study
SSR          MnSSR_927245 F/R  CGGATCTCCTTCTCCCAGAT / GAGATGCGCAAGTACCGC  This study
SSR          MnSSR_920478 F/R  AACTCGCATGGCTGCCT / GAGGTCACGGAGGCTCG      This study
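The ISSR primer names in Table 4.2 use a compact repeat notation, in which a motif in parentheses is followed by its repeat count, and any anchoring bases (including IUPAC ambiguity codes such as B, D, H, R and Y) are written outside the parentheses. As a minimal sketch of how this notation expands to the full primer sequence, the following Python function (the name expand_primer is illustrative only and is not part of any software used in this study) reproduces the sequences listed in the table:

```python
import re


def expand_primer(name: str) -> str:
    """Expand repeat notation such as '(GA)8T' into the full primer sequence."""
    # Each '(MOTIF)n' group becomes MOTIF repeated n times; letters outside the
    # parentheses (anchors, IUPAC ambiguity codes) are passed through unchanged.
    return re.sub(
        r"\(([A-Z]+)\)(\d+)",
        lambda match: match.group(1) * int(match.group(2)),
        name,
    )


for primer_name in ["(GA)8T", "BHY(AGC)5", "CCA(TGA)5TG"]:
    print(primer_name, expand_primer(primer_name))
```

The SSR primer names (e.g. MnSSR_(GC)5 F/R) are not covered by this sketch, because those primers lie in the sequence flanking the repeat and cannot be derived from the repeat motif alone.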

Table 4.3 List of all SSR and ISSR primers selected for analysis of genetic variation in Microdochium nivale field isolates collected across three years.

Number of Annealing Primer Type Primer Name Polymorphic Bands Temperature (°C) BHY(AGC)5 12 ISSR (GA)8T 20 45 DD(CCA)5 19 MnSSR_(GC)5 F/R 20 MnSSR_(CT)5 F/R 23 58 SSR MnSSR_(GAT)6 F /R 10 MnSSR_916061 F/R 16 55 MnSSR_920478 F/R 17


Table 4.4 Results of linkage disequilibrium calculations performed using the program Disequil (described in Mahuku et al. 1998) on isolates of Microdochium nivale collected in three separate years and in two locations at the Guelph Turfgrass Institute. See Figure 4.3 for a map depicting these locations.

Collection Year  Collection Location*  Observed variance  Expected variance  Index of association
2011             PG                    15.067             6.240              1.41
2011             Rd                    32.187             8.0339             3.01
2012             PG                    112.371            15.746             6.14
2012             Rd                    137.890            15.604             7.84
2013             PG                    121.458            15.597             6.79
2013             Rd                    136.514            19.432             6.03

* PG = near pathology green; Rd = along roadside
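The index of association values in Table 4.4 are consistent with the usual definition I_A = V_O / V_E - 1 (Smith et al. 1993), where V_O is the observed variance and V_E is the expected variance. The following Python lines are a minimal sketch under that assumption; they do not reproduce the Disequil program, and the function name is illustrative only.

```python
def index_of_association(observed_variance: float, expected_variance: float) -> float:
    """Return I_A = V_O / V_E - 1; values near zero suggest linkage equilibrium."""
    return observed_variance / expected_variance - 1.0


# Observed and expected variances copied from Table 4.4.
table_4_4 = {
    ("2011", "PG"): (15.067, 6.240),
    ("2011", "Rd"): (32.187, 8.0339),
    ("2012", "PG"): (112.371, 15.746),
    ("2012", "Rd"): (137.890, 15.604),
    ("2013", "PG"): (121.458, 15.597),
    ("2013", "Rd"): (136.514, 19.432),
}

for (year, location), (v_obs, v_exp) in table_4_4.items():
    print(year, location, round(index_of_association(v_obs, v_exp), 2))
```

Rounded to two decimal places, the computed values match the final column of the table.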


[Figure 4.1 diagram: a double-stranded sequence in which repetitive CT/GA tracts alternate with non-repetitive sequence, with the positions of the SSR forward primer, the SSR reverse primer and the ISSR primers marked.]

Figure 4.1 Diagram depicting the relative positions of hypothetical SSR and ISSR loci and the primers that could amplify these regions. The SSR loci (white text on black background) are flanked by ISSR regions (black text on white background). Whereas the ISSR primer (white text on black arrow) is anchored within the repetitive SSR region, and thus can be designed with only knowledge about the repetitive SSR sequence, the SSR primers (black text on white arrows) are located within the non-repetitive ISSR sequences, and thus require more detailed genomic information.


Figure 4.2 Grasses displaying symptoms of pink snow mold and / or Fusarium patch. Both photos were taken at the GTI, Guelph, Ontario, 2 March 2012. a) Kentucky bluegrass in the area to the east of the native green and b) annual bluegrass / creeping bentgrass mixture, native green.


[Figure 4.3 map: north arrow (N); collection locations A and B; scale bar = 20 m.]

Figure 4.3 Map of the Guelph Turfgrass Institute with collection locations and key landmarks indicated: A: Near pathology green (PG); B: roadside (Rd). The grass species at both collection locations was Poa pratensis (Kentucky bluegrass).



Figure 4.4 UPGMA tree depicting relationships between M. nivale isolates collected from P. pratensis at two locations (Figure 4.3) yearly from 2011-2013. Bootstrap values (out of 100) are displayed on key nodes. Legend: 1: 2013 pathology green isolates; 2: 2013 roadside; 3: 2012 pathology green; 4: 2012 roadside; 5: 2011 pathology green; 6: 2011 roadside.


Appendices for Chapter 4

Appendix 4.1 Sample input for linkage disequilibrium calculation with Arlequin

[Profile]
NbSamples=6
DataType=RFLP # - {DNA, RFLP, MICROSAT, STANDARD, FREQUENCY}
GenotypicData=0 # - {0, 1}
GameticPhase=1 # - {0, 1}
LocusSeparator=TAB # - {TAB, WHITESPACE, NONE}
RecessiveData=0 # - {0, 1}
MissingData='?' # A single character specifying missing data
# Some advanced settings the experienced user can uncomment
# Frequency= ABS # - {ABS, REL}
# FrequencyThreshold= 1.0e-5 # - (Any real number, usually between 1.0e-7 and 1.e-3)
# EpsilonValue= 1.0e-7 # - (Any real number, usually between 1.0e-12 and 1.0e-5)

[Data]

[[HaplotypeDefinition]] HaplListName="List of observed haplotypes" HaplList = { 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 0 1 0 1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... }

[[Samples]]

#1
SampleName="2013_pg"
SampleSize=43
SampleData= {
1 1
2 1
3 1
4 1
5 1
6 1
7 1
8 1
9 1
10 1
11 1
12 1
13 1
14 1
15 1
16 1
17 1
18 1
19 1
20 1
21 1
22 1
23 1
24 1
25 1
26 1
27 1
28 1
29 1
30 1
31 1
32 1
33 1
34 1
35 1
36 1
37 1
38 1
39 1
40 1
41 1
42 1
43 1
}

#2
SampleName="2013_rd"
SampleSize=31
SampleData= {
44 1
45 1
46 1
47 1
48 1
49 1
50 1
51 1
52 1
53 1
54 1
55 1
56 1
57 1
58 1
59 1
60 1
61 1
62 1
63 1
64 1
65 1
66 1
67 1
68 1
69 1
70 1
71 1
72 1
73 1
74 1
}

#3
SampleName="2012_pg"
SampleSize=25
SampleData= {
75 1
76 1
77 1
78 1
79 1
80 1
81 1
82 1
83 1
84 1
85 1
86 1
87 1
88 1
89 1
90 1
91 1
92 1
93 1
94 1
95 1
96 1
97 1
98 1
99 1
}

#4
SampleName="2012_rd"
SampleSize=17
SampleData= {
100 1
101 1
102 1
103 1
104 1
105 1
106 1
107 1
108 1
109 1
110 1
111 1
112 1
113 1
114 1
115 1
116 1
}

#5
SampleName="2011_pg"
SampleSize=5
SampleData= {
117 1
118 1
119 1
120 1
121 1
}

#6
SampleName="2011_rd"
SampleSize=9
SampleData= {
122 1
123 1
124 1
125 1
126 1
127 1
128 1
129 1
130 1
}

[[Structure]]
StructureName="by year"
NbGroups=3

#1
Group={
"2013_pg"
"2013_rd"
}

#2
Group={
"2012_pg"
"2012_rd"
}

#3
Group={
"2011_pg"
"2011_rd"
}
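The sample entries in Appendix 4.1 follow a regular pattern: each isolate is listed as its own haplotype with a count of 1, and haplotype numbers run consecutively across samples. As a minimal sketch of how such an entry could be generated rather than typed by hand, the Python function below builds one sample entry. The function name is illustrative, and the layout of one haplotype number and count per line is an assumption about the original file; this is not the script used to prepare the input above.

```python
def sample_block(name: str, first_id: int, n_isolates: int) -> str:
    """Build one sample entry in which every isolate is a distinct haplotype."""
    lines = [f'SampleName="{name}"', f"SampleSize={n_isolates}", "SampleData= {"]
    # One line per isolate: haplotype number followed by its count (always 1 here).
    lines += [f"{hap_id} 1" for hap_id in range(first_id, first_id + n_isolates)]
    lines.append("}")
    return "\n".join(lines)


# Example: the 2011 pathology green sample (haplotypes 117 to 121).
print(sample_block("2011_pg", 117, 5))
```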


Appendix 4.2 Banding patterns detected with primer groups studied (n= 136). Multiple rows with the same number of bands represent different banding patterns based on band sizes.

                                                          Number of isolates   Repetitive banding patterns
                                                          with a unique        Number     Number of isolates
Primer Name       Band sizes observed                     banding pattern      of bands   displaying pattern
MnSSR_916061 F/R  3,000; 1,300; 1,150; 1,000; 850; 700;   22                   0          5
                  525; 475; 450; 425; 400; 325; 300;                           1          10
                  250; 170                                                     2          6
                                                                               2          30
                                                                               3          3
                                                                               3          10
                                                                               4          5
                                                                               5          13
                                                                               6          3
                                                                               6          5
                                                                               6          5
                                                                               7          4
                                                                               7          10

MnSSR_920478 F/R  5,000; 2,000; 1,300; 1,000; 900; 800;   86                   0          1
                  725; 675; 550; 475; 375; 350; 300;                           3          11
                  275; 250; 200; 150; 75                                       4          4
                                                                               4          5
                                                                               4          6
                                                                               5          3
                                                                               6          3
                                                                               7          4
                                                                               7          4
                                                                               8          4

MnSSR_(CT)5 F/R   2,500; 1,400; 1,300; 1,200; 1,000;      91                   0          5
                  950; 900; 800; 750; 625; 600; 525;                           3          4
                  450; 375; 300; 260; 225; 200; 160;                           3          12
                  100; 90                                                      4          3
                                                                               4          7
                                                                               5          6
                                                                               7          3

MnSSR_(GAT)6 F/R  1,300; 1,250; 1,050; 900; 775; 500;     40                   0          23
                  375; 300; 275; 200                                           1          7
                                                                               1          28
                                                                               2          5
                                                                               2          14
                                                                               3          3
                                                                               3          7
                                                                               4          4

MnSSR_(GC)5 F/R   5,000; 1,500; 1,100; 1,050; 1,000;      40                   0          7
                  900; 800; 700; 650; 500; 475; 425;                           1          3
                  380; 325; 300; 275; 225; 175; 150;                           1          26
                  75                                                           2          7
                                                                               2          16
                                                                               4          3
                                                                               4          3
                                                                               5          4
                                                                               5          5
                                                                               5          6
                                                                               7          11

DD(CCA)5          2,200; 1,900; 1,600; 1,500; 1,300;      80                   0          1
                  1,200; 1,075; 1,050; 1,000; 950;                             1          7
                  900; 860; 850; 730; 650; 575; 500;                           4          3
                  450; 400                                                     4          4
                                                                               4          5
                                                                               5          3
                                                                               5          3
                                                                               5          5
                                                                               5          6
                                                                               5          11
                                                                               7          3

(GA)8T            2,400; 2,200; 1,700; 1,600; 1,400;      113                  0          4
                  1,200; 1,100; 1,000; 950; 875; 800;                          2          4
                  750; 650; 600; 550; 510; 480; 400;                           2          4
                  310; 260                                                     3          3
                                                                               3          3

BHY(AGC)5         2,100; 2,000; 1,800; 1,100; 1,050;      104                  0          1
                  900; 800; 700; 620; 550; 480; 400                            4          3
                                                                               4          7
                                                                               4          9
                                                                               8          7


Chapter 5 Mating Type Experiments

5.1 Introduction

5.1.1 Reproduction in Fungi

As a group, fungi, like plants, are capable of both sexual and asexual reproduction. Asexual propagation may occur through the dispersal of mitotically produced spores (conidia) or hyphal fragments, whereas sexual reproduction involves the formation of meiospores (Taylor et al. 1999). Many species of fungi are capable of both mechanisms of reproduction, and may employ both under different conditions (e.g. based on environmental factors, such as temperature or the availability of nutrients). Historically, fungi were classified based on the morphology of their sexual structures (Webster and Weber 2007); however, the sexual cycle of some fungi is completely unknown. In the past, apparently asexual species were placed in the Deuteromycetes, but this taxonomic class has been abandoned in more recent classification systems by re-organizing taxa based on evolutionary relationships established using other tools, such as DNA sequencing data (Hibbett et al. 2007; Schoch et al. 2009). In addition, molecular techniques have facilitated the identification of cryptic sexual reproduction in populations previously believed to be strictly asexual (Arie et al. 2000).

Among species that do reproduce sexually, the mechanisms and requirements of sexual reproduction vary widely. In the Dikarya, which are composed of the Ascomycota and the Basidiomycota, mating is apparently under the control of the genes at a single locus (Butler 2007). In heterothallic Ascomycota, the mating type locus, usually abbreviated as MAT1 (Turgeon and Yoder 2000), contains one of two possible genes (or gene families). While both MAT genes in heterothallic species occupy the same genetic region (i.e. the MAT1 locus) in different individual isolates, they are referred to as idiomorphs rather than alleles to emphasize their extremely divergent nature (Metzenberg and Glass 1990). Collectively, both genes are referred to as mating type (MAT) genes, and sexual reproduction in heterothallic Ascomycota requires plasmogamy between two individuals that each carry a different idiomorph of the MAT genes.

Although individual isolates of heterothallic Ascomycota possess only a single mating type gene and require an individual with the opposite MAT gene for meiosis to occur, other modes of sexual reproduction are possible. Homothallic species contain both mating type genes within single nuclei, and individual strains (i.e. single-spore- or single-nucleus-derived cultures) are self-fertile and usually do not outcross with any other individual. More complex mating systems also exist, including pseudohomothallism, wherein different nuclei are found in the meiospore, each possessing one of the two MAT genes (Merino et al. 1996). Most ascospores produced in pseudohomothallic species possess both nuclei (and thus contain both MAT genes), but, rarely, individual ascospores may contain only a single nucleus. The cultures derived from these monokaryotic spores are self-sterile but are fertile with cultures derived from either a monokaryotic spore of the opposite mating type, or with cultures containing both nuclei as in the parent tissue (Merino et al. 1996).

Most filamentous Ascomycete species studied may be readily categorized as asexual, homothallic, heterothallic, or pseudohomothallic based on their mating behaviour. Genetic studies examining the MAT genes have generally supported the non-taxonomic groupings based on these observations (e.g. (Yun et al. 2000)), with the intriguing exception that putative MAT genes have been identified in the genomes of apparently asexual species (e.g. (Arie et al. 2000; Yun et al. 2000)). However, unusual mating systems that do not agree with these categories are known. For example, in the heterothallic plant pathogen Ophiostoma quercus, isolates of both mating types are reported to contain both MAT idiomorphs (Wilken et al. 2012).

In a more complex variation from the simple categories described above, several species within the genus Glomerella have displayed unusual mating patterns which do not seem to be solely dependent upon the two "canonical" MAT genes in the Ascomycota (Butler 2007). For example, in G. lindemuthiana, one cross between self-sterile strains produced asci and ascospores in culture, yet both parental strains apparently contained a single MAT gene (Rodriguez-Guerra et al. 2005). In G. graminicola, both self-sterile and homothallic strains have been identified (Vaillancourt et al. 2000), but only a single MAT gene was identified in all isolates tested by PCR (Chen et al. 2002). The single MAT gene that has been amplified in several species of Glomerella is a single idiomorph; the opposite idiomorph has not been successfully amplified by PCR (Menat et al. 2012; Rodriguez-Guerra et al. 2005; Vaillancourt et al. 2000) nor identified from the genome sequences available to date (Menat et al. 2012). Unbalanced heterothallism, wherein individual strains of a species contain mutated fertility genes that may either be compatible or incompatible, was proposed by Wheeler (1954) to explain his observations with this genus; however, molecular evidence for this hypothesis has not been found (Menat et al. 2012). Similarly, within the well-studied genus Neurospora, only a single MAT gene has been detected among the species N. africana, N. lineolata, and N. galapagoensis, despite their apparent homothallism (Lin and Heitman 2007). Together, these observations suggest that the relatively simple model of fungal mating as solely under the control of the MAT1 locus may be incomplete for at least some taxa of Sordariomycetes.


5.1.2 Genes associated with sexual reproduction

The current model of sexual reproduction in the Ascomycota states that, as a group, these fungi possess two different mating types, usually referred to as + and –, A and a, or 1 and 2 (Casselton 2008). The names MAT1-1 and MAT1-2 have been suggested as standardized names for these two idiomorphs (Turgeon and Yoder 2000). Each of the two idiomorphs may include one or more genes (Turgeon and Yoder 2000). The MAT genes appear to code for transcription factors, which in turn regulate the transcription of genes pertaining to the synthesis of hormones, proteins and compounds directly relevant to sexual reproduction (Pöggeler 2000). The two MAT idiomorphs may be identified by their DNA-binding motifs: whereas MAT1-1 contains an α-box motif, MAT1-2 contains an HMG (high-mobility group) box (Arie et al. 1999). This observation facilitates the discovery of mating type genes in new species, because the remainder of the sequence is often highly divergent (Taylor et al. 1999).

Within the MAT1-1 idiomorph, the α-box-containing gene is usually named MAT1-1-1 (Turgeon and Yoder 2000), and may be associated with two additional genes, called MAT1-1-2 and MAT1-1-3. Although all three genes appear to play critical roles in sexual reproduction when present (Klix et al. 2010), mutant strains of Neurospora crassa carrying MAT1-1-1 but with deletions of both MAT1-1-2 and MAT1-1-3 are capable of successful, albeit reduced, reproduction (Ferreira et al. 1998). Alpha boxes are a general family of transcription regulators that are found in association with a wide variety of other proteins (Günther et al. 1998). In contrast, the MAT1-2 idiomorph consists of a high-mobility group (HMG) protein, usually termed MAT1-2-1 (Turgeon and Yoder 2000), and may include a second gene, MAT1-2-2 (Pöggeler and Kuck 2000; Staben and Yanofsky 1990). The HMG box is a common motif in the DNA-binding domain of diverse proteins, including transcription factors (Bianchi and Agresti 2005) and the mammalian sex-determining protein SRY (Dubin and Ostrer 1994).

Despite containing highly conserved amino acid motifs (Coppin et al. 1997), the sequences of the MAT genes are poorly conserved overall (Yun et al. 2000). Further complicating the assignment of mating type using molecular methods, both MAT1-1-3 and MAT1-2-1 contain an HMG box (Butler 2007; Yun et al. 2000). The high conservation of the HMG and / or the α-box portion(s) and the lack of conservation of the remaining portion of the gene may allow inadvertent assignment of the incorrect mating type to an isolate of interest. However, this error can be avoided because, to date, MAT1-1-2 and MAT1-1-3 have always been found in association with MAT1-1-1 (Lin and Heitman 2007), whereas MAT1-2-1 may be found either with an α-box (in the case of homothallic strains with MAT1-1-1) or without (such as in heterothallic strains).
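The reasoning above can be written out as a small decision rule. The Python function below is a minimal sketch only: its name and the wording of the returned labels are illustrative, and it assumes that the presence or absence of each motif has already been established by some other means (e.g. sequence similarity searches).

```python
def interpret_mat_motifs(alpha_box_found: bool, hmg_box_found: bool) -> str:
    """Summarize which MAT idiomorph(s) the detected DNA-binding motifs support."""
    if alpha_box_found and hmg_box_found:
        # The HMG box could belong to MAT1-1-3 or to MAT1-2-1, so the motifs
        # alone cannot separate a MAT1-1 strain from one carrying both idiomorphs.
        return "MAT1-1 present; HMG box may be MAT1-1-3 or MAT1-2-1"
    if alpha_box_found:
        return "MAT1-1"
    if hmg_box_found:
        # MAT1-1-3 has only been reported together with MAT1-1-1, so an HMG box
        # found without an alpha-box is attributed to MAT1-2-1.
        return "MAT1-2"
    return "no MAT motif detected"


print(interpret_mat_motifs(alpha_box_found=False, hmg_box_found=True))
```

The first branch is deliberately non-committal: when both motifs are present, the identity of the HMG-box gene has to be resolved from the rest of its sequence or from its position in the locus.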

Because of the poorly conserved nature of the nucleotide sequences of these mating type genes between species, the identification of MAT gene sequences without genomic information can be challenging. However, the identification of the MAT1 locus is facilitated in the Pezizomycotina because the MAT gene(s) are frequently located in between two protein-coding genes: APN2 (encoding a DNA lyase) and SLA2 (encoding a protein that plays a role in cytoskeletal assembly) (Butler 2007). Most of the species in which this pattern has been studied belong to the Sordariomycetes, and some unpublished research from this lab (Y. Deng & T. Hsiang, personal communication) has found that this synteny is less prevalent among the Dothideomycetes; therefore, there may be class-level differences in the conservation of this organizational pattern.


Even in homothallic species that have MAT idiomorphs located in two entirely separate locations within the genome, a partial sequence of one of these usually-adjacent proteins may be located proximally to each MAT1 locus (Rydholm et al. 2007). The most common orientation is for the SLA2 sequence to be upstream of the MAT gene(s), with APN2 located immediately downstream (Butler 2007). In addition to SLA2 and APN2, there are other genes frequently associated with the MAT region in the Sordariomycetes. Two additional proteins, an anaphase-promoting complex protein (APC5), and cytochrome c oxidase (COX13), are frequently identified within the vicinity of the MAT1 locus (Butler 2007). For simplicity, the genes typically located immediately up- and down-stream of the MAT1 locus in other Sordariomycetes (SLA2, APN2, APC5, and COX13) will be collectively referred to herein as "flanking genes."
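The use of flanking genes to narrow down the position of the MAT1 locus can be sketched as follows. This is a minimal Python illustration and not the procedure used in this thesis: the function name and the example gene order are hypothetical, and the sketch assumes that the genes on a contig have already been annotated and listed in their order along that contig.

```python
from typing import List


def genes_between_flanks(gene_order: List[str]) -> List[str]:
    """Return the genes lying between SLA2 and APN2 on one contig."""
    if "SLA2" not in gene_order or "APN2" not in gene_order:
        # Without both flanking genes on the same contig, no interval is defined.
        return []
    # Sorting the two positions makes the result independent of contig orientation.
    start, end = sorted((gene_order.index("SLA2"), gene_order.index("APN2")))
    return gene_order[start + 1:end]


# Hypothetical gene order on a single contig.
contig = ["COX13", "APC5", "SLA2", "MAT1-2-1", "APN2"]
print(genes_between_flanks(contig))
```

Any gene returned by such a search is only a candidate: its identity as a MAT gene would still need to be confirmed from its sequence, for example by the presence of an α-box or HMG box.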

5.1.3 Sexual reproduction in M. nivale and M. majus

The sexual stage of M. nivale sensu lato has been described, but reports of its frequency are inconsistent. Where sexual reproduction has been described, M. nivale sensu lato has been reported to form perithecia containing numerous asci 75 µm in length and 7-8 µm in width (Stevens 1918), each containing eight ascospores. The ascospores are hyaline and ellipsoidal, with 0-3 septa, and are 10-17 µm in length and 3.5-4.5 µm in width (Booth 1971). When the teleomorph of M. nivale sensu lato was first described as Calonectria nivalis, Schaffnit (1913) described the conidia of the corresponding asexual stage as ranging from 14.6 to 25.2 µm in length and from 3.2 to 4.3 µm in width, with most conidia possessing three septa. The identification of C. nivalis from several cereal hosts, as well as this description of conidial size, does not clearly establish whether the anamorph was M. nivale or M. majus (Glynn et al. 2005); however, the predominance of tri-septate conidia is more similar to M. majus than to M. nivale.


Bennett (1933) observed the formation of immature perithecia only on sterilized plant fragments in the lab, as opposed to plant tissue in the field, and only when freshly-isolated fungal strains were studied. In contrast, Wollenweber reported that approximately 10% of isolates produced perithecia in the lab and that those strains that did form sexual structures produced fertile perithecia in abundance on different substrates (as cited in Bennett 1933). Gordon (1952) reported that single conidial cultures were self-fertile (i.e. that M. nivale sensu lato is homothallic), and that perithecia were not present in the field in Canada; however, mature perithecia were later observed on wheat in North America (Cook and Bruehl 1966). The sexual stage of M. nivale sensu lato has been frequently reported in Europe (Cook and Bruehl 1966).

In more recent experiments, which distinguished between M. nivale and M. majus (then considered conspecific varieties), both species were found to produce fertile perithecia in the lab (Lees et al. 1995; Litschko and Burpee 1987; Parry et al. 1995). However, M. majus appears to produce perithecia more readily than M. nivale (Lees et al. 1995). Single-spore isolates of both species produce perithecia in the lab (Lees et al. 1995; Parry et al. 1995), and thus both species have been described as potentially homothallic. A further distinction was observed among M. nivale isolated from either cereal or turfgrass when isolates from cereal, but not turfgrass, produced perithecia in the lab (Smith 1983); however, fertile perithecia were obtained from paired M. nivale turfgrass isolates in a later experiment (Litschko and Burpee 1987). In this experiment, pairings of isolates collected from both cereal and turfgrass (and cross-pairings of isolates collected from each host) yielded fertile perithecia (Litschko and Burpee 1987). This observation, in addition to the high degree of genetic variation in M. nivale relative to M. majus, has led some to suggest that M. nivale may also reproduce heterothallically (Lees et al. 1995; Mahuku et al. 1998). Neither the MAT nor the flanking genes from M. nivale and M. majus have been described, but the identification of these genes within these species may help to clarify some of the confusion regarding whether M. nivale is truly homo- or heterothallic.

5.1.4 Objectives

The primary objective of this project was to identify mating-type genes within multiple genomes of M. nivale and M. majus to assess whether these species are homo- or heterothallic, and whether they undergo sexual reproduction. The presence of mating type genes in the M. nivale and M. majus isolate collections was first assessed using primers developed for other fungal species, and then from whole-genome sequenced isolates of M. nivale and M. majus. Mating crosses were also performed in the lab to investigate possible cross-fertility within and between species.

5.2 Materials and Methods

5.2.1 Test of mating type primers based on conserved sequences

Published MAT1-1-1 and MAT1-2-1 primers (Table 5.1) were tested with DNA from nine isolates of M. nivale and two isolates of M. majus (Table 5.2). Each reaction was repeated twice, using annealing temperatures of 55 and 53 °C, and using a final concentration of 0.4 mM of Mg2+ in each reaction mixture. The thermocycling program was as described for the ITS amplification (Section 2.2.4).

Mating-type primers for MAT1-1 and MAT1-2 were designed by manually selecting conserved regions in an alignment of nucleotide sequences collected from GenBank (Table 5.3); the sequences were aligned and visualized as described (Section 2.2.6). The primers that were designed in this manner (Table 5.4) were tested using the isolates summarized in Table 5.1. All primers were tested using the PCR protocol and thermocycling conditions described for the ITS amplification (Section 2.2.4), with the exception that annealing temperatures between 50 and 62 °C were tested for the primer sets. The amplicons from seven isolates amplified with primers designed in this manner were sequenced and analysed in the manner described in Chapter 2 (Section 2.2.4).

5.2.2 Identification of putative MAT1 loci and flanking genes and screening of isolate collection

The amino acid sequences of MAT1-1-1, MAT1-1-2, MAT1-1-3, MAT1-2-1, and the flanking genes SLA2 and APN2 from several species of filamentous ascomycetes (Table 5.2) were used to query both the assembled scaffolds and the predicted gene sequences from the genomes of M. majus isolate 99049 and M. nivale isolates 11037 and 12262 (Chapter 3). Primers were designed for the putative sequences of these genes by selecting regions amenable to primer design (Section 2.2.6).

Four primers, Mn_MAT2_3347F, Mn_MAT2_3871R, Mn_lyase_838F, and Mn_SLA2_23R, were designed based on an early assembly of the M. majus 99049 genome, which was the first Microdochium sp. genome available (Chapter 3). The MAT1-2-1 primers Mn_MAT2_3347F and Mn_MAT2_3871R were tested using DNA from both M. nivale and M. majus (including the sequenced isolate) using the PCR conditions described for the ITS amplification (Section 2.2.4). The Mn_lyase_838F and Mn_SLA2_23R primer set was first tested using DNA from the sequenced isolate and using the PCR mixture described for the RPB2 amplification (Section 2.2.4), with a thermocycling protocol consisting of an initial 5 minute denaturation at 95 °C followed by 35 cycles of a 1 minute denaturation at 95 °C, a 1 minute annealing period at 55 °C, and a 3 minute extension at 72 °C. The final extension consisted of a 10 minute period at 72 °C. This protocol was modified by increasing the annealing period to 2 minutes and the extension period in 1-minute increments up to 6 minutes in length.

This reaction was also attempted using the Expand LT (Expand Long Template PCR system; Roche, Laval, Quebec) enzyme mixture. The reaction was performed using buffer mixture #2 (containing 27.5 mM of Mg2+) and according to the manufacturer's directions with the exception that a final concentration of 500 nM, rather than 300 nM, of each primer was included. The basic thermal cycling procedure consisted of an initial denaturation of 2 minutes at 94 °C, followed by 10 cycles of a 10 s denaturation at 94 °C, a 30 s annealing period at 50 °C, and an 8 minute extension period at 68 °C. This was followed by 20 cycles with the same parameters, except that the duration of the extension period was increased by 20 s each cycle. These cycles were followed by a final extension at 68 °C for 7 minutes. This cycle was modified by varying the annealing temperature up to 63 °C and by increasing the extension temperature to 70 °C to reduce non-specific amplification.
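The Expand LT cycling procedure described above can be written out step by step to check its programmed duration. The following is a minimal sketch only: the function name is hypothetical, ramp times between temperatures are ignored, and it assumes that the 20 s extension increment applies from the first of the 20 later cycles onward, which the text does not state explicitly.

```python
def expand_lt_steps(anneal_c=50, extend_c=68):
    """Build the basic Expand LT programme as (label, temperature in C, seconds) steps."""
    steps = [("initial denaturation", 94, 2 * 60)]
    # First 10 cycles: fixed 8 minute extension.
    for _ in range(10):
        steps += [
            ("denaturation", 94, 10),
            ("annealing", anneal_c, 30),
            ("extension", extend_c, 8 * 60),
        ]
    # Next 20 cycles: extension grows by 20 s per cycle.
    # Assumption: the first of these cycles is already 20 s longer than 8 minutes.
    for cycle in range(1, 21):
        steps += [
            ("denaturation", 94, 10),
            ("annealing", anneal_c, 30),
            ("extension", extend_c, 8 * 60 + 20 * cycle),
        ]
    steps.append(("final extension", extend_c, 7 * 60))
    return steps


steps = expand_lt_steps()
total_seconds = sum(seconds for _, _, seconds in steps)
longest_extension = max(s for label, _, s in steps if label == "extension")
print(f"programmed time: {total_seconds / 60:.0f} min (ramping excluded)")
print(f"longest cycle extension: {longest_extension} s")
```

Under these assumptions the programme holds temperature for roughly five and a half hours, which illustrates why the long-template reactions were reserved for the SLA2 and APN2 primer combinations.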

Later, based on a more complete assembly of the M. majus 99049 genome (Chapter 3), the primer Mn_SLA2_1156F was designed. This primer was paired with the Mn_lyase_838F primer using the Expand LT amplification conditions described above. In addition to these reactions, the Mn_SLA2_1156F primer was also paired with the Mn_MAT2_3871R primer and the Mn_MAT2_3347F primer was paired with the lyase_838R primer to confirm the synteny of this region. These reactions were performed using the Expand LT conditions described above.

The alignments for each set of primers designed are found in Appendices 5.1 through 5.4, and a list of the primers ordered and tested is found in Table 5.4. All M. majus isolates available in the local lab collection were screened with the primer set Mn_MAT2_3347F and Mn_MAT2_3871R to determine whether the putative MAT1-2-1 gene identified in M. majus isolate 99049 could be found in all isolates. The region spanning from the SLA2 gene to the APN2 gene was also tested in all available isolates.

When the M. nivale 11037 genome became available, the primer set Mn_MAT2_20F and Mn_MAT2_727R was designed to amplify a partial sequence of the MAT1-2-1 gene. These primers were tested using the PCR conditions described for the amplification of ITS (Section 2.2.4). The primer set Mn_SLA2_1156F and Mn_APN2_700R was designed to amplify a region spanning from the end of the putative SLA2 gene to the beginning of the putative APN2. In addition, the primers Mn_SLA2_1156F and Mn_MAT2_727R, and separately, Mn_MAT2_20F and Mn_APN2_700R were paired to test the synteny of the region surrounding the putative MAT1 locus. All reactions involving either the APN2 or the SLA2 primers were performed using the Expand LT enzyme as described above. To determine whether the putative MAT1-2-1 gene identified in M. nivale isolate 11037 could be amplified in all M. nivale isolates, a variety of M. nivale isolates from different geographic and host origins were screened with the primer set Mn_MAT2_20F and Mn_MAT2_727R (Table 5.8).

The MAT sequence data from M. nivale isolate 12262 were combined with the data from the previously-sequenced M. nivale 11037 and M. majus 99049, and were used to design the primer sets Mic_MAT2_198F and Mic_MAT2_676R, and Mic_SLA2_92F and Mic_APN2_32R, to amplify the putative MAT1-2-1 and the region between SLA2 and APN2, respectively.

The amplification protocols for these primer sets were as described for the equivalent reactions performed using the primers designed with the data from M. nivale isolate 11037. These primer sets were used to amplify the region spanning from the end of the putative SLA2 through to the beginning of APN2, and to amplify the putative MAT1-2-1 region in DNA isolated from both M. nivale and M. majus.

5.2.3 Mating experiments

Wheat straw was collected from the dairy barn at the University of Guelph. The straw was cut into segments 2-3 cm long, and these fragments were autoclaved three times at 121 °C for 20 minutes. Following sterilization, the wheat straw was placed into the centre of a Petri plate containing either PDA (experiment #1) or water agar (experiment #2). Water agar was prepared by combining 20 g of Bacto agar (Becton, Dickinson and Company, MD, USA) with 1 L of water and autoclaving for 20 minutes at 121 °C. The straw was inoculated with two plugs of agar cut from the actively-growing margin of a colony. A list of the isolates used and the crosses performed in these experiments is found in Table 5.9. The agar plugs were placed at opposite ends of the straw, and the plates were sealed with Parafilm and incubated in the dark in loosely-sealed plastic storage bins (KIS Omni Box, 9L, Milton, ON). In experiment #1, the plates were stored at 10 °C, and in experiment #2, replicate plates were incubated at 5, 10, 15, and 20 °C. Plates were checked weekly for the presence of perithecia.

Where perithecia were present, a single perithecium was gently removed from the surface of the wheat straw and was placed in 100 µL of sterile dH2O. The perithecium was vortexed for 15 s, and then allowed to settle at room temperature for 15 minutes. An aliquot of 50 µL was then viewed under 400x magnification to check for the presence of ascospores. Ascospores of Microdochium spp. are shorter and thinner than conidia (Booth 1971). The average size and number of septa from ten ascospores was recorded for each of the isolates producing ascospores.

When putative ascospores were observed, the spore suspension was streaked on a PDA plate and incubated at room temperature for 24 h. After 24 h, single germinated ascospores were identified at 100 x magnification and were transferred to fresh PDA plates. Mycelium was harvested from ten single-ascospore cultures in the manner described in Chapter 2.


To ensure that ascospores were obtained and not conidia, their size was compared to that of conidia. The average dimensions (length, width, and number of septa) were recorded for ten conidia from each of eight mono-conidial colonies of M. majus. Conidiation was induced by incubating inoculated PDA plates overlaid with cellophane under constant light for up to one week at room temperature. Conidia were collected by pipetting 2 mL of sterile dH2O onto the surface of the plate, scraping gently with a sterile glass rod, and viewing the resulting conidial suspension under 400x magnification.

To investigate the similarity of the genotypes of the single-ascospore (SA) cultures to the "parent" culture and to the other "sibling" cultures, the DNA isolated as described above was assessed in two ways. First, the presence or absence of the putative MAT1-2-1 gene identified as described above was assessed using the primers Mic_MAT2_198F and Mic_MAT2_676R. The genetic diversity was also assessed using the ISSR and SSR primers listed in Table 4.3 using the PCR protocol described in Chapter 4. The banding patterns obtained for the ten SA cultures and the parent culture were also compared to those for an additional seven M. majus isolates. The resulting binary matrix was used to construct a bootstrapped UPGMA tree (Section 4.2.4).
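As an illustration of how a binary band matrix leads to a UPGMA tree, the sketch below clusters presence/absence profiles by average linkage. It is not the procedure of Section 4.2.4: the Jaccard distance is an assumed choice, the band profiles and culture names are invented for the example, and bootstrapping is omitted.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - (bands shared) / (bands present in either profile)."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - shared / either


def upgma(profiles):
    """Cluster {name: [0/1, ...]} by average linkage; return a nested-tuple tree."""
    names = sorted(profiles)
    d = {(a, b): jaccard_distance(profiles[a], profiles[b])
         for a in names for b in names}
    clusters = [(name, [name]) for name in names]  # (subtree, leaf names)
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            leaves_i, leaves_j = clusters[i][1], clusters[j][1]
            # UPGMA: unweighted mean of all leaf-to-leaf distances.
            avg = (sum(d[a, b] for a in leaves_i for b in leaves_j)
                   / (len(leaves_i) * len(leaves_j)))
            if best is None or avg < best[0]:
                best = (avg, i, j)
        _, i, j = best
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Hypothetical band scores (1 = band present, 0 = absent), not thesis data.
bands = {
    "parent": [1, 1, 1, 0],
    "SA1": [1, 1, 1, 0],
    "SA2": [1, 1, 1, 1],
    "other": [0, 0, 1, 1],
}
tree = upgma(bands)
print(tree)
```

Cultures with identical banding patterns have a distance of zero and therefore join first, which is the behaviour expected for clonal single-ascospore progeny.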

5.2.4 Comparison of Microdochium sp. with other species

The synteny of the MAT region in several fungal species was compared to that observed in the genomes of M. majus isolates 99049 and 10095, M. nivale isolates 11037, 12262, and 10106, and M. bolleyi isolate 07020. The genomes used in these comparisons were all from members of the Xylariales, including Daldinia eschscholtzii (downloaded from JGI), Annulohypoxylon stygium (courtesy of B. Xie, Fujian Agriculture and Forestry University), two species of Hypoxylon, Pestalotiopsis neglecta (courtesy of K. Watanabe, Tamagawa University), and Pestalotiopsis sp. (courtesy of K. Watanabe, Tamagawa University). The putative genes for APN2, SLA2, APC5, COX13, and, where present, MAT1-1-1, MAT1-1-2, MAT1-1-3, and MAT1-2-1 were identified by performing BLASTx searches using published protein sequences for these genes (Table 5.2) against BLAST databases of the assembled sequences of each genome. A map of the MAT region was constructed for each of the genomes studied. For the Microdochium spp., the putative gene sequences identified were also searched against the predicted gene sets for each species.

For those genomes where putative MAT genes were found in locations distant to the typical flanking genes, the regions 10,000 bp both up- and down-stream of the putative mating type genes were examined to determine whether there were any other putative genes that were common to these loci. The genes found within these regions were tentatively identified in the other genomes tested to search for commonalities in their location relative to the other genes of interest.
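The 10,000 bp windows examined on either side of a putative mating type gene can be expressed as a simple coordinate calculation. The sketch below is illustrative only: the function names, gene names, and coordinates are hypothetical, positions are 1-based and inclusive, and "upstream" is taken to mean lower scaffold coordinates regardless of gene strand.

```python
def flanking_windows(gene_start, gene_end, scaffold_length, flank=10_000):
    """Return (upstream, downstream) windows clipped to the scaffold ends."""
    upstream = (max(1, gene_start - flank), gene_start - 1)
    downstream = (gene_end + 1, min(scaffold_length, gene_end + flank))
    return upstream, downstream


def genes_in_window(window, annotations):
    """List annotated genes (name, start, end) overlapping the window."""
    lo, hi = window
    return [name for name, start, end in annotations
            if start <= hi and end >= lo]


# Hypothetical scaffold of 20,000 bp with a putative MAT gene at 12,000-13,500.
annotations = [
    ("geneA", 1_500, 2_500),
    ("geneB", 15_000, 16_000),
    ("geneC", 100, 900),
]
up, down = flanking_windows(12_000, 13_500, scaffold_length=20_000)
print(up, genes_in_window(up, annotations))
print(down, genes_in_window(down, annotations))
```

Clipping the windows to the scaffold boundaries matters for draft assemblies, where a gene of interest may sit close to the end of a short scaffold.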

5.3 Results

5.3.1 Test of published mating type primers and redesigned universal primers

Primers designed to amplify the mating type genes MAT1-1-1 and MAT1-2-1 from several different species of Ascomycetes were tested on 11 isolates of M. nivale and M. majus to identify the presence of these genes in these species. Using the published MAT1-2-1 primers ChHMG1 and ChHMG2, a faint band of the predicted size (approximately 300 bp) was produced from three M. nivale isolates. The MAT1-1-1 Falpha1 and Falpha2 primer set also amplified several weak bands (approximately 2,000, 1,000, and 300 bp in length) from both the M. nivale and the M. majus isolates. Despite optimization attempts, a single band was not obtained from either of these primer sets.


With the goal of producing a single band in sufficient quantities for sequencing, new MAT1-1-1 and MAT1-2-1 primers were designed from conserved regions of MAT sequences downloaded from GenBank (Table 5.3). Despite testing all of the primers described in all possible combinations (i.e. all forward primers for a single gene were tested with all of the reverse primers available), putative amplicons of neither MAT1-1-1 nor MAT1-2-1 could be amplified from either species of interest using previously-published data.

5.3.2 Identification of putative mating type and flanking genes

The sequencing data of M. majus isolate 99049 provided the first Microdochium genome obtained (May 2011). When the raw data from this genome were assembled (Chapter 3), an attempt was made to identify putative sequences for the MAT1-1-1 or the MAT1-2-1 genes using sequences collected from GenBank (Table 5.6). In the early assembly of the M. majus genome available at that time, putative matches for MAT1-2-1, SLA2, and APN2 were identified on separate contigs. A putative match for MAT1-1-1 was not found. More details regarding the identification of the flanking genes in this isolate are found in Appendix 5.11.

A set of primers, Mn_MAT2_3347F and Mn_MAT2_3871R, was designed to amplify the putative MAT1-2-1 sequence. The MAT1-2-1 primers designed with the Mm99049 genome successfully amplified a single band of the predicted size with all of the M. majus isolates tested, but failed to yield an amplicon from any of the M. nivale isolates tested. When the amplicon of M. majus isolate 10148 was sequenced in the forward direction, the resulting sequence was 100% identical to the putative MAT1-2-1 sequence from the genome data across the 441 bp sequence obtained.


A second set of MAT1-2-1 primers, Mn_MAT2_3356F and Mn_MAT2_3979R, was prepared based on the putative MAT1-2-1 sequence identified in the genome. All of the M. majus isolates tested with this primer set produced a single band of the predicted size (623 bp), but, again, none of the M. nivale isolates tested showed a band with this primer pair.
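The predicted band sizes quoted in this chapter agree with the numeric suffixes of the primer names, if those suffixes are read as positions on a common reference sequence. That naming convention is inferred here rather than stated in the text, so the following is a hedged sketch with hypothetical function names; it simply subtracts the forward position from the reverse position.

```python
import re


def primer_position(name):
    """Split a primer name such as 'Mn_MAT2_3356F' into (position, direction)."""
    match = re.search(r"_(\d+)([FR])$", name)
    if match is None:
        raise ValueError(f"no position suffix found in {name!r}")
    return int(match.group(1)), match.group(2)


def predicted_amplicon_bp(forward, reverse):
    """Predicted size, assuming both suffixes index the same reference sequence."""
    f_pos, f_dir = primer_position(forward)
    r_pos, r_dir = primer_position(reverse)
    if (f_dir, r_dir) != ("F", "R"):
        raise ValueError("expected a forward primer followed by a reverse primer")
    return r_pos - f_pos


for pair in [("Mn_MAT2_3356F", "Mn_MAT2_3979R"),
             ("Mn_MAT2_20F", "Mn_MAT2_727R"),
             ("Mic_MAT2_198F", "Mic_MAT2_676R")]:
    print(pair, predicted_amplicon_bp(*pair), "bp")
```

The three pairs reproduce the 623 bp, 707 bp, and 478 bp bands reported for these primer sets. The calculation does not apply to primer pairs named after different genes (for example an SLA2 forward primer with an APN2 reverse primer), because their suffixes refer to different reference sequences.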

The flanking genes SLA2 and APN2 were tentatively identified in the M. majus genome by querying sequences collected from GenBank (Table 5.6) against the early assembly of the M. majus genome. These genes were identified based on their high sequence identity with the query sequences (Table 5.7). However, in this early assembly, each of these genes and the putative MAT1-2-1 gene were found on separate contigs, and as a result the synteny of this region that was observed in other Sordariomycetes (Figure 5.1) could not be confirmed using the genomic data alone.

When an improved assembly of the M. majus genome became available (December 2011), the putative SLA2, APN2, and MAT1-2-1 genes previously identified were found to be contiguous. However, these genes were arranged in a different configuration from that observed for many other Sordariomycetes (Figure 5.1). This new information was used to design a new primer, Mn_SLA2_1156F, which was paired with the Mn_APN1_838F primer. Serendipitously, the latter functioned as a reverse primer because the putative lyase gene was downstream of SLA2 (rather than upstream, as predicted from other species) and possessed an inverted orientation relative to other species. This combination of primers yielded a single band of the predicted size (8 kb) for all of the M. majus isolates with which it was tested. The orientation of the MAT1 region and the flanking genes as proposed based on the whole-genome assembly of M. majus isolate 99049 was confirmed by amplifying the region from the tail end of the SLA2 gene to the MAT1-2-1 gene (using primers Mn_SLA2_1156F and Mn_MAT2_3871R) and the region from the MAT1-2-1 gene to the beginning of the APN2 gene (using primers Mn_MAT2_3347F and lyase_838F). Both reactions yielded single bands of the predicted sizes (4.5 kb and 2.3 kb) for all of the M. majus isolates tested, regardless of their geographic origin.

The combinations of primers described above that successfully amplified the APN2 through SLA2 cassette, as well as the MAT1-2-1 gene and intergenic regions, were applied to DNA from M. nivale isolates from grass and wheat, and from both Europe and Canada. However, these primers failed to amplify the genes of interest under any of the conditions attempted. All of the putative MAT and flanking genes identified in M. majus were also found among the genes predicted using AUGUSTUS (Chapter 3). A summary of the similarity of these predicted genes to the best match among the sequences used to query against these sequences is found in Table 5.7.

When the genome of M. nivale isolate 11037 became available (February 2012), the putative MAT1-2-1, APN2, and SLA2 genes were identified in both the genome assembly and in the predicted gene set by using both the sequences listed in Table 5.6 and the putative MAT1 locus and flanking genes identified in M. majus isolate 99049. A summary of the similarity of these predicted genes to the best match among the sequences used to query against these sequences is found in Table 5.7. A description of the similarity of these sequences to the putative orthologs in M. majus is provided in Section 5.2.3. As in the later assembly of M. majus, all three genes were found on a single contig, and there was no match for MAT1-1-1. The configuration of the MAT1 locus and flanking genes in M. nivale 11037 was identical to that predicted for M. majus 99049, with the exception of minor variations in sequence length (Figure 5.2).


Primers were designed for the APN2 (Mn_APN2_32R), SLA2 (Mn_SLA2_357F), and MAT1-2-1 (Mn_MAT2_20F and Mn_MAT2_727R) genes identified in the genome of M. nivale isolate 11037. Following optimization, the SLA2 and APN2 primer set was used to successfully amplify a band of the predicted size (7 kb) in all of the M. nivale and M. majus isolates tested. The MAT1-2-1 primers yielded a band of the predicted size (707 bp) from the sequenced M. nivale isolate 11037, but produced inconsistent results with many of the other M. nivale isolates tested; only 59% of the 92 isolates tested were amplified. There was no apparent relationship between an isolate's host plant or general geographic origin and its amplification by the MAT1-2-1 primers. A summary of the number of isolates tested and the number of these which could be amplified using these primers may be found in Table 5.7. None of the M. majus isolates could be amplified with the M. nivale MAT1-2-1 primers.

The genome of M. nivale isolate 12262 was chosen for whole-genome sequencing (Chapter 3) because this isolate was among the group of M. nivale isolates that could not be amplified using the primers designed from M. nivale 11037. The putative MAT1-2-1, APN2, and SLA2 genes were identified in the draft genome assembly and in the predicted gene set by using both the sequences listed in Table 5.6 and the putative MAT1 locus and flanking genes identified in M. majus isolate 99049 and in M. nivale 11037. A summary of the similarity of these predicted genes to the best match among the sequences used to query against these sequences is found in Table 5.7.

A description of the similarity of these sequences to the putative homologs or orthologs in the other Microdochium genomes is provided in Section 5.2.3. As described for the other Microdochium genomes, all three genes were found on a single contig and shared the same synteny (Figure 5.2), with only minor variation in sequence length. No match for MAT1-1-1 was observed.

The final sets of MAT1 locus and flanking gene primers were designed by combining the data from all three of the genome sequences available. The primers were Mic_SLA2_92F and Mic_APN2_32R (spanning the region between the 3' end of SLA2 to the 5' end of APN2), and Mic_MAT2_198F and Mic_MAT2_676R, which fell within the putative MAT1-2-1 sequence.

The "universal" MAT1-2-1 primers successfully amplified the putative MAT1-2-1 gene from all of the M. nivale and M. majus isolates with which they were tested, including both those which could and could not be amplified with the Mn_MAT2_20F and Mn_MAT2_727R primer set (designed using the March 2012 genome assembly of M. nivale isolate 11037). The SLA2 through APN2 region was also successfully amplified in all M. nivale and M. majus isolates tested using this "universal" primer set.

5.3.3 Mating experiments

In the first mating experiment, isolates of M. majus and M. nivale (Table 5.9) were inoculated on sterilized wheat straw placed on PDA and incubated in the dark at 10 °C for 6 months. All possible crosses were performed with the isolates included, including self by self. After 16 weeks, perithecia were visible on all of the plates inoculated with at least one isolate of M. majus. In crosses including one isolate of M. majus and one isolate of M. nivale, perithecia were found only on the half of the straw closest to the M. majus inoculum. In crosses including two different isolates of M. majus, or in plates including a self by self M. majus cross, perithecia were found throughout the plant tissue. The perithecia were checked for asci and ascospores for an additional 10 weeks following the initial observation of perithecia, but for the remainder of the experiment, neither asci nor ascospores were observed.

In the second experiment, water agar, rather than PDA, was used as the medium in the Petri dishes containing sterilized wheat straw. The isolates included in these experiments are summarized in Table 5.9. The isolates were plated on the wheat-straw plates in all possible crosses, including self by self. Four copies of each possible cross were prepared, and one copy of each plate was stored at 5, 10, 15, or 20 °C in the dark.

Table 5.10 summarizes the results of the mating crosses after two months of incubation. In crosses containing two isolates of M. majus (including self by self crosses), and incubated at 20 °C, perithecia (Figure 5.4) were observed in 80% of the crosses, including in all four of the self by self crosses. Among the eight cases in which perithecia were produced, asci with ascospores were observed in six out of the eight crosses (Table 5.10). In plates incubated at 20 °C containing crosses of M. majus with M. nivale, perithecia were observed on twelve of the twenty plates prepared, and of the twelve plates with perithecia, asci and ascospores were observed in six cases. All four of the M. majus isolates produced perithecia and ascospores in at least one pairing, and when perithecia were produced, they were clearly located on the side of the wheat straw that was closest to the M. majus inoculum (Figure 5.4). Among the plates inoculated with M. nivale only, perithecia were produced on only two plates, and neither asci nor ascospores were observed.
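The plate counts quoted for each temperature follow from enumerating all possible crosses, including self by self. The sketch below is illustrative: the isolate labels are placeholders rather than the isolates of Table 5.9, and the figure of five M. nivale isolates is an assumption inferred from the twenty M. majus by M. nivale plates and the four M. majus isolates mentioned in the text.

```python
from itertools import combinations_with_replacement, product


def within_group_crosses(isolates):
    """All unordered pairings within one group, self by self included."""
    return list(combinations_with_replacement(isolates, 2))


def between_group_crosses(group_a, group_b):
    """Every isolate of one group paired with every isolate of the other."""
    return list(product(group_a, group_b))


# Placeholder labels; the real isolates are listed in Table 5.9.
majus = ["Mm1", "Mm2", "Mm3", "Mm4"]
nivale = ["Mn1", "Mn2", "Mn3", "Mn4", "Mn5"]  # count of five is an assumption

majus_crosses = within_group_crosses(majus)
mixed_crosses = between_group_crosses(majus, nivale)
nivale_crosses = within_group_crosses(nivale)

plates_per_temperature = (len(majus_crosses) + len(mixed_crosses)
                          + len(nivale_crosses))
print(len(majus_crosses), len(mixed_crosses), len(nivale_crosses))
print("plates across four temperatures:", 4 * plates_per_temperature)
```

With four M. majus isolates there are ten within-species crosses, four of them self by self, so the reported 80% corresponds to the eight crosses in which perithecia were produced.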

The production of perithecia was observed less frequently at 15 °C. At this temperature, perithecia were observed in only six of the ten plates inoculated with two isolates of M. majus, and of these six, ascospores were observed in only three. In the plates containing M. majus crossed with M. nivale, only eight of the twenty plates contained perithecia, but of the plates with perithecia, ascospores were observed in eight. None of the plates inoculated with two isolates of M. nivale yielded perithecia.

At 10 °C, only three of the plates inoculated with two isolates of M. majus yielded perithecia, and ascospores were observed in only one case. None of the M. majus by M. nivale plates yielded perithecia, but three of the plates inoculated with two isolates of M. nivale yielded perithecia, although ascospores were not observed in these structures. Neither ascospores nor perithecia were observed on any of the plates incubated at 5 °C.

To confirm that the spores observed in the perithecial tissue were ascospores as opposed to conidia, their sizes were recorded. A total of ten ascospores for three of the self by self M. majus crosses were measured (Appendix 5.9). The putative ascospores were (12.0-)14.4 to 19.2(-24.0) µm in length and (2.4-)3.6 to 4.8(-6.0) µm in width, with 1-3 (predominantly 3) septa observed. The average size of the putative ascospores did not differ between different isolates. The average conidial dimensions were also recorded for a total of eight mono-conidial colonies of M. majus to compare the average appearance of M. majus conidia to the putative ascospores. For the eight isolates tested, the average dimensions of the conidia were 26 by 5 µm, with between 0 and 3 septa.

The DNA from ten single-ascospore (SA) cultures derived from the apparently homothallic M. majus isolate 99049, as well as DNA from this "parent" culture, was isolated and analysed using the Mic_MAT2_198F and Mic_MAT2_676R primer set to determine whether these spores contained the putative MAT1-2-1 gene identified in the other M. nivale and M. majus isolates examined. The DNA from all ten of the SA cultures, as well as the DNA from the parent culture, produced a single band of the predicted size (478 bp), suggesting that the putative MAT1-2-1 gene was indeed present. This DNA, in addition to DNA isolated from an additional seven M. majus isolates (Table 5.5), was also analysed using the ISSR and SSR primers listed in Table 4.3. All seven of the other isolates were also successfully amplified by the Mic_MAT2_198F and Mic_MAT2_676R primer set. This subset of isolates was chosen because 99061 was collected at the same time and location as isolate 99049 (in Atwood, ON, in 1999); isolates 12043-12046 were all collected from the same field (in Ottawa, ON, in 2012); and isolates 10098 and 10099 were from Europe and thus intended as out-groups.

For the ISSR primers BHY(AGC)5 and DD(CCA)5 and for the SSR primer sets MnSSR_CT5, MnSSR_GAT6, MnSSR_916061, and MnSSR_924078, identical banding patterns were observed with the DNA isolated from all 10 of the SA cultures as well as from the "parent" culture. For the ISSR primer (CAC)5, the DNA from eight of the ten SA cultures and the parent culture yielded a single band of approximately 700 bp. No band was observed for the DNA from the other two SA cultures. For the SSR primer MnSSR_GC5, five of the SA cultures yielded identical banding patterns with a total of four bands, while four of the SA cultures and the parent culture yielded a three-band pattern that was missing a band of approximately 1.6 kb that had been found in the four-band isolates. The remaining SA culture was not amplified by these primers. The relationships between the SA cultures, the parent culture, and the seven other isolates were visualized by constructing a UPGMA tree with 100 bootstrap replications (Figure 5.6). In this tree, all SA cultures and the parent culture grouped into a single clade with a bootstrap value of 82%. The isolate collected at the same time as the parent culture (99061) did not group with these isolates. The four Ottawa isolates were grouped into a single clade with 37% bootstrap support that also included European isolate 10098.


5.3.4 Comparison of the Microdochium MAT1 locus to that of other species

The synteny of the MAT1 locus and the flanking genes was assessed using whole-genome data from Microdochium nivale, M. majus, M. bolleyi, and seven other species within the Xylariales (Table 5.11). In addition to the flanking genes SLA2 and APN2 described above, the genes APC5 and COX13, which are associated with the MAT1 locus in many species (Butler 2007), were also identified. The sequences used to query against these genomes are summarized in Table 5.6.

Putative sequences for all four of the flanking genes and MAT1-2-1 were identified in all 13 genomes examined. Unlike in the Microdochium genomes, and with the exception of Pestalotiopsis sp., the putative MAT1-2-1 sequences identified in the other Xylariales genomes were not located in proximity to the putative flanking genes. Only very weak matches for MAT1-1-1 and MAT1-1-2 (e-values of 0.02 or greater) were identified in any of the Xylariales genomes. The MAT1-1-3 query sequence, which contains an HMG box, had at least two matches with an e-value less than 1e-05 in all of the species examined, including the Microdochium genomes. The same matches were identified for the MAT1-2-1 query sequence. For all 12 genomes, all of the sequences identified as putative MAT1-2-1 and / or MAT1-1-3 genes were queried against the GenBank database (BLASTx) (Table 5.11). For all twelve genomes, the best matches for these sequences were putative HMG-box-containing proteins. For those species where the putative match to MAT1-1-3 / MAT1-2-1 was distant from any of the flanking genes, it was not possible to determine whether the sequence identified was likely one of these two mating-type genes or if it was more likely to be a different HMG-box-containing protein. In the Microdochium genomes, as well as in P. theae, the HMG-box-containing protein found between the flanking genes was tentatively assigned the identity of a putative mating-type gene. The apparent absence of MAT1-1-1 led to the classification of this mating-type gene as MAT1-2-1.
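The way the two e-value cut-offs quoted above separate weak from credible matches can be sketched as follows. The thresholds (0.02 and 1e-05) come from the text; the "intermediate" class, the function names, and the example hits are assumptions made only for illustration and do not reproduce the actual search results.

```python
# Cut-offs quoted in the text: matches at 0.02 or greater were considered
# very weak, and matches below 1e-05 were considered credible.
WEAK_EVALUE = 0.02
STRONG_EVALUE = 1e-05

def classify_hit(evalue):
    """Label a single match by its e-value."""
    if evalue < STRONG_EVALUE:
        return "strong"
    if evalue >= WEAK_EVALUE:
        return "weak"
    # Assumption: values between the two cut-offs are left unclassified here.
    return "intermediate"

def strong_hit_counts(hits):
    """Count, per query sequence, the matches below the strong cut-off."""
    counts = {}
    for query, evalue in hits:
        if classify_hit(evalue) == "strong":
            counts[query] = counts.get(query, 0) + 1
    return counts

# Hypothetical (query, e-value) pairs for one genome.
example_hits = [
    ("MAT1-1-1", 0.5),
    ("MAT1-1-3", 2e-12),
    ("MAT1-1-3", 4e-08),
    ("MAT1-2-1", 2e-12),
]
print(strong_hit_counts(example_hits))
```

Because the MAT1-1-3 and MAT1-2-1 queries both contain an HMG box, strong matches to either query can point to the same genomic sequence, which is why the matches alone could not distinguish the two genes.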

All of the sequences identified as matches to the four flanking genes were also further investigated (Table 5.11) to determine whether there were any putative duplicated sequences for the four flanking genes. Although there was more than one match for each of the flanking genes in many of the genomes studied, in most cases these additional sequences appeared to be unrelated to the flanking genes. In both M. majus 99049 and M. majus 10095, secondary matches (in addition to those found in proximity to the putative MAT region) for cytochrome c oxidase were identified on a separate scaffold; however, in both cases, this sequence was not found in the proximity of any of the other secondary matches to the other MAT-related genes identified. A similar trend was observed for M. nivale isolate 12262, where a secondary match for APC5 was identified on a separate scaffold relative to the other sequences of interest.

The relative genomic organization of the proposed mating region, including the flanking sequences, for all of the genomes studied is depicted in Figure 5.2. The orientation of the flanking genes relative to the putative MAT gene was conserved in all of the Microdochium spp. studied, but differed in all seven of the other genomes studied, including the three Hypoxylon sp. genomes.

5.4 Discussion

In this Chapter, the putative mating type regions of Microdochium nivale, M. majus, and M. bolleyi were identified from the whole-genome data from these three species based on comparison to other published sequences. Primers were designed from the sequences identified and were used to screen the isolate collections of M. nivale and M. majus, and mating crosses were performed in an attempt to further explore the conditions required for sexual reproduction to occur.

Based on earlier reports that M. majus and M. nivale are homothallic, both MAT1-1-1 and MAT1-2-1 were predicted to exist within the genomes of all isolates of these species. The initial search for the MAT genes was conducted using the published primer sets Falpha and ChHMG in an attempt to amplify MAT1-1-1 and MAT1-2-1, respectively, from the DNA of M. nivale and M. majus. However, inconsistent results were obtained with these primers, and a single band of the predicted size was not produced by either primer set. The lack of amplification with these primers could be explained by considering that MAT genes are highly variable at the nucleotide level, and that Microdochium is a member of the Xylariales, whereas the MAT1-1-1 primers were designed for use in Fusarium sp. (a member of the Hypocreales), and the MAT1-2-1 primers were designed for use in Colletotrichum higginsianum (Chen et al. 2002), a member of the Glomerellales.

To address these issues, several sets of "conserved" MAT1-1-1 and MAT1-2-1 primers were then designed based on MAT1-1-1 and MAT1-2-1 sequences collected from GenBank. The goal was to design new primers that were biased towards species that may be more closely related to Microdochium than those that were used to design the previously-tested primers.

Although some of the primers did amplify bands of varying sizes from M. nivale and M. majus, when these bands were sequenced, they displayed low homology with any of the available sequences in the GenBank database and shared no apparent relationship with the MAT sequences that had been used to design the primers. These results suggested that these primers were probably annealing to sequences in the genome that are unrelated to the sequences of interest. Based on these inconsistent and disappointing results, the search for MAT genes was postponed until the whole-genome data became available (Chapter 3).

When the genome sequencing data of M. majus isolate 99049 were first assembled (May 2011), a putative MAT1-2-1 sequence was identified on a single contig based on its homology to the query sequences used. Surprisingly, no putative match to MAT1-1-1 was identified; however, especially with this single and preliminary genome sequence, it was not possible to determine whether the MAT1-1-1 gene was truly missing, whether it was merely situated in a poorly-assembled region of the genome, or whether it bore too little homology to the query sequences to be identified in this manner.

Despite this confusion, primers were designed to amplify this putative MAT1-2-1 sequence. The primer set produced a single band of the predicted size from all of the M. majus isolates tested, but did not yield consistent results with DNA from M. nivale. Sequencing the PCR product from one of the M. majus isolates produced a sequence that was identical to the putative M. majus MAT1-2-1 sequence identified from the genome. Although several different sets of conditions were tested, the M. majus MAT1-2-1 primers did not produce bands of the predicted size from DNA of M. nivale isolates. This inability to amplify the gene of interest was likely due to sequence differences between M. nivale and M. majus, and the genomic sequences showed that there were polymorphisms in the priming region.

In an attempt to investigate the MAT region of M. nivale before its genome sequence became available, a strategy used by other researchers (Putman et al. 2011) was employed. In this method, the genes APN2 and SLA2, predicted to be located immediately up- and downstream, respectively, from the putative MAT1-2-1 sequence (Figure 5.1), were identified in the M. majus genome. The presence of these genes within the vicinity of the putative MAT1-2-1 sequence also affirmed that this was the MAT locus in M. majus. Although sequences sharing high homology to both sequences of interest were identified, all three sequences of interest (the two flanking genes and the putative MAT1-2-1) were located on different contigs in the May 2011 assembly of M. majus isolate 99049, and thus the synteny of these genes relative to that in other species could not be assessed from the genomic data alone. Regardless, based on the gene order in other species (Figure 5.1), one (forward) primer was designed near the 3' end of the APN2 sequence, and a second (reverse) near the 5' end of the SLA2 sequence, with the goal of amplifying the region spanning from the tail end of the first flanking gene through to the beginning of the second, which was hypothesized to encompass the putative MAT1-2-1 sequence. Because the sequences of the flanking genes are more strongly conserved than those of the MAT genes (Putman et al. 2011), this method was also designed to provide access to the M. nivale MAT1 region. Based on other species, this region was expected to be between 4 and 10 kb in length.
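The dependence of this flanking-primer strategy on the assumed gene order can be illustrated with a small in-silico sketch. The sequences, primer names, and functions below are hypothetical and greatly shortened; exact-match searching stands in for primer annealing, which is a simplification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (uppercase ACGT only)."""
    return seq.translate(COMPLEMENT)[::-1]

def amplicon_length(template, forward, reverse):
    """Predicted product length on the top strand, or None if no product forms."""
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the top strand, so its binding site reads
    # as the primer's reverse complement and must lie downstream of the forward primer.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return end + len(site) - start

# Hypothetical stand-ins for the tail of one flanking gene and the head of the other.
first_gene_tail = "ACCGT"
second_gene_head = "GGATC"
forward_primer = first_gene_tail
reverse_primer = reverse_complement(second_gene_head)

# Template with the gene order assumed from other species.
assumed_order = "CC" + first_gene_tail + "T" * 10 + second_gene_head + "AA"
# Template in which the two flanking sites occur in the opposite order.
opposite_order = "CC" + second_gene_head + "T" * 10 + first_gene_tail + "AA"

print(amplicon_length(assumed_order, forward_primer, reverse_primer))
print(amplicon_length(opposite_order, forward_primer, reverse_primer))
```

With the assumed order a product spanning both sites is predicted, whereas the same primer pair yields no product when the sites occur in the opposite order.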

Unexpectedly, the attempted amplification of the full region from APN2 through to SLA2 was unsuccessful with M. majus DNA, including that of the sequenced isolate 99049, even when the Expand LT enzyme system, designed to amplify fragments up to 20 kb, was used. Amplification from the 5' end of the putative MAT1-2-1 to the 5' end of SLA2, and from the 3' end of APN2 through the 3' end of MAT1-2-1, was also attempted based on the hypothesis that the full APN2 through SLA2 sequence may be longer than predicted. It was expected that at least one of these two pairings might provide a sequence short enough for successful amplification, but no product was ultimately obtained from either pairing.

When the genome of M. majus isolate 99049 was re-assembled (December 2011, Chapter 3), the previously-identified putative sequences of APN2, MAT1-2-1, and SLA2 were found on a single scaffold. However, the order of these genes differed from that predicted from other species: SLA2, rather than APN2, was found to be upstream of the putative MAT1-2-1 gene and, relative to the original contig studied, the APN2 sequence was reversed. To confirm this unexpected observation, a new primer was designed at the 3' end of SLA2, with the intention of amplifying the region from SLA2 through to APN2. As the putative APN2 sequence was reversed relative to the original contig, the previously-designed APN2 primer was used as a reverse primer. These new primers successfully amplified an 8 kb fragment from all of the M. majus isolates available in the local lab collection, but still failed to amplify DNA from M. nivale.

However, the successful PCR result confirmed the unexpected relative positions of the flanking genes and the putative MAT1-2-1 gene in M. majus. The overall size of this region is in agreement with the size of this region in other species. No other genes were detected in the intergenic regions between the putative MAT1-2-1 gene and the flanking genes. When predicted gene sets were prepared using the whole-genome data, both of the flanking genes and the putative MAT1-2-1 sequences identified previously were found within the predicted genes, further confirming that these sequences were indeed the desired genes of interest. In addition, as predicted from the scaffold sequences, no predicted genes were found between the flanking genes and the putative MAT1-2-1. None of the predicted genes displayed homology to MAT1-1-1.

When the whole-genome data for M. nivale isolate 11037 became available (February 2012), the putative SLA2, APN2, and MAT1-2-1 genes were identified on a single contig. These three genes shared the same synteny as in M. majus 99049. The predicted protein set for this isolate also contained all three of the sequences identified in the scaffolds, and did not contain any predicted genes between either flanking gene and the putative MAT1-2-1. When primers were designed using the M. nivale 11037 putative MAT1-2-1 sequence as a template, only 59% of the M. nivale isolates screened yielded the predicted results. There was no apparent pattern among which isolates could or could not be amplified with respect to the isolate's host plant or continent of origin. However, those isolates that could be amplified with the MAT1-2-1 primers could also be amplified with the flanking primers (and vice versa); similarly, the failure to amplify the MAT1-2-1 gene predicted the inability to amplify the flanking genes. This relationship led to the development of the hypothesis that those M. nivale isolates that could not be amplified may possess the MAT1-1-1 gene, rather than MAT1-2-1, and that the non-amplification of the flanking genes spanning the MAT1 cassette may be indicative of a size difference between this region in the amplifying and non-amplifying isolates. This hypothesis was also consistent with earlier suggestions that at least some isolates of M. nivale may be heterothallic (Lees et al. 1995; Mahuku et al. 1998). To investigate this hypothesis, the M. nivale isolate 12262 (which could not be amplified by any of the existing MAT-region primers) was submitted for genome sequencing.

When the genome sequence of M. nivale isolate 12262 was obtained, putative sequences for SLA2, MAT1-2-1, and APN2 were identified both on a single scaffold and in the predicted gene sets. In contrast to the hypothesis, MAT1-1-1 was not identified in the genome of this isolate. The sequences of these genes were aligned for all three of the sequenced Microdochium isolates (Appendix 5.3, Appendix 5.4, Appendix 5.5), and the priming regions for all of the primers designed and ordered to date were investigated. For all three genes, the sequence obtained from M. nivale isolate 12262 was more similar to that from M. majus 99049 than to M. nivale 11037. At least one mismatch for the other species was detected within the priming region of all of the primers ordered previously. For the two M. nivale isolates, the SLA2 primer had only a single mismatch at a position 5 base pairs from the 3' end of the primer; when the scaffold sequences were compared, the full length of the region from the 5' end of SLA2 to the 3' end of APN2 (which includes the putative MAT1-2-1) differed by only 30 bp between these two isolates.

The inability of the flanking-gene primers to amplify this region in M. nivale isolate 12262 (and other isolates) is perhaps better explained by polymorphism in the 3' end of the priming site.

The full length of the region between the SLA2 and APN2 primers was 7,420 bp based on the scaffold sequence; if the activity of the enzyme were impaired by a poorly-available priming site caused by this mismatch in the primer, lack of amplification of the full, desired sequence would result. Primers designed based on the alignment of the sequences from all three sequenced isolates (Mic_MAT2_198F and Mic_MAT2_676R for the MAT1-2-1 gene and Mic_SLA2_92F and Mic_APN2_32R for the flanking genes) successfully amplified DNA from all of the M. nivale and M. majus isolates tested. Based on this result, all of the M. nivale and M. majus isolates in the local lab collection appear to possess the MAT1-2-1 gene. The dissimilarity of the MAT1-2-1 genes identified in M. nivale isolates 12262 and (later) 10106, compared to that from isolate 11037, suggests that there may be two distinct alleles of the MAT1-2-1 gene within this species.
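The kind of priming-site check described above, locating each primer mismatch relative to the 3' end, can be sketched as follows. The sequences are hypothetical and the counting convention (1 = the 3'-terminal base) is an assumption; the actual primer sequences are not reproduced here.

```python
def mismatch_positions(primer, site):
    """Return, for each mismatch, its distance from the primer's 3' end.

    Both sequences are written 5' to 3' in the same orientation and must be
    the same length. A distance of 1 denotes the 3'-terminal base.
    """
    if len(primer) != len(site):
        raise ValueError("primer and priming site must be the same length")
    length = len(primer)
    return [length - index
            for index, (p, s) in enumerate(zip(primer, site))
            if p != s]

# Hypothetical primer and the corresponding priming site in a second isolate.
primer = "ACGTACGTAC"
site_in_other_isolate = "ACGTAGGTAC"
print(mismatch_positions(primer, site_in_other_isolate))
```

Mismatches close to the 3' end are generally the most disruptive to extension, so reporting positions from that end makes it straightforward to flag primers that are likely to fail on a divergent isolate.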

Ultimately, obtaining information about the MAT genes in M. nivale and M. majus by sequencing the genomes of these species was technically simpler, less time-consuming, and less expensive than trying to obtain this information using published primers from other genera.

The discovery of MAT1-2-1 as apparently the only mating-type gene in M. nivale and M. majus was unexpected based on the reported ability of these species to undergo homothallic reproduction. The apparent absence of MAT1-1-1 from the genomes of the sequenced isolates seems to directly contradict this claim since conventional knowledge holds that both mating types are required for ascospore production (Butler 2007), although some exceptions to this trend are known (e.g. Menat et al. 2012). The amplification of MAT1-2-1 from all of the M. nivale isolates tested, including those from both wheat and turf and from both Europe and North America, suggests that, if these isolates of this species are heterothallic, they all possess the same mating type. One explanation for this imbalance could be the rarity of MAT1-1-1 in the field. As only 12 isolates of M. nivale from Europe were available for analysis, it is possible that, due to simple chance, none of these isolates possessed MAT1-1-1. If this mating type truly is uncommon in Europe, this imbalance could also have resulted in the complete absence of this mating type in North America if the founding population of this species was small. This lack of MAT1-1-1 in North America could also explain why perithecia have not been observed on grasses (Lees et al. 1995). Although the rarity of one mating type among populations of heterothallic species has been reported (e.g. Linde et al. 2003), this hypothesis does not agree with some researchers' observations of perithecia and ascospores for M. nivale (e.g. Lees et al. 1995), and also contrasts with the high genetic variability of this species that has been reported (Lees et al. 1995; Mahuku et al. 1998). Further implications and possible explanations for these observations are discussed below, in the context of the multi-species mating type analysis, as well as in Chapter 7.

To explore the surprising and contradictory observations made using the genomic information, isolates of M. nivale and M. majus were inoculated onto wheat straw on either PDA or water agar and incubated at 5, 10, 15, or 20 °C. Perithecia with ascospores were observed on several of the M. majus plates after only six weeks of incubation on water agar. Ascospores were produced both for single-spore isolates in self-by-self crosses and on plates that contained two different isolates; however, there were no indications that the ascospores produced were the result of mating between the two isolates. Instead, in most cases, there was a clear demarcation in the approximate centre of the wheat straw, suggesting that the perithecia were produced by one isolate alone (Figure 5.3). As all isolates used were derived from single conidia, and because the conidia of most Ascomycetes contain only a single nucleus (Webster and Weber 2007), these observations are in agreement with previous reports that M. majus is homothallic.

The putative ascospores identified in all of the apparent mating events were examined by measuring their size and recording their appearance. In all cases, these apparent ascospores matched the reported descriptions of ascospores from M. nivale sensu lato, and they were readily distinguished from conidia based on their "lumpier" appearance and their shorter length (10-17 µm (Booth 1971) compared to 15-33 µm (Glynn et al. 2005)) relative to conidia from the same isolate. When the conidial dimensions of eight M. majus isolates were recorded, the average size observed for the isolates tested was larger than that of the putative ascospores, confirming that the spores observed from the wheat-incubated colonies were truly ascospores. The genotypes of SA-derived cultures were compared to their "parent" isolate using SSR and ISSR. Seven additional isolates of M. majus, including one collected at the same time and from the same location as the parent isolate, were included in this analysis. The SSR and ISSR primers chosen for this experiment were those that produced different banding patterns for the M. nivale samples collected across three years (Chapter 4), and the results of this analysis were visualized by constructing a UPGMA tree with 100 bootstrap replicates.

Despite minor differences in the banding patterns produced by the eight SSR and ISSR primers, the ten SA isolates and their parent culture were more closely related to each other than to any of the other isolates included in this study. The four Ottawa isolates, 12043-12046, were found in a single clade that also included one of the European isolates. In addition, the Atwood isolate 99061, which was collected on the same date and from the same location as isolate 99049, was not closely grouped with any of the 99049 isolates included. These results show that the ten SA cultures were indeed highly similar to their parent culture 99049. This result is in agreement with the hypothesis that these isolates are the product of homothallic mating by the sequenced M. majus isolate 99049, despite the apparent absence of the MAT1-1-1 gene in the genome of this or any other Microdochium isolate examined.

The temperature of incubation and the nutrient content of the media were found to play an important role in the production of perithecia, and possibly also the formation of ascospores. Ascospores were only observed among isolate crosses incubated on water agar, rather than on PDA; this suggests that ascospore production may be a direct response to nutrient starvation. In contrast, the production of ascospores was most frequent at 20 °C, a temperature at which these fungi grow rapidly in culture (Tronsmo et al. 2001). At 5 °C, a temperature that is below the optimum in-lab growth temperature but well within the range of temperatures at which these species cause disease (Smiley et al. 2005), ascospore production was not observed.

Together, these observations suggest that the production of ascospores in the field may be most common when these fungi are not actively causing disease symptoms. Although these fungi grow readily on both artificial media and plant tissue in the lab at 20 °C, in the field, these temperatures also favour the growth of several other pathogenic fungi that are active on the same hosts (Smiley et al. 2005). It is thus possible that ascospores and / or perithecia may provide an alternate mechanism for over-summering. In 1993, Koizuma (in Tronsmo et al. 2001) reported that perithecia may survive on plant matter on the soil for up to three months; however, to investigate the role that ascospores may play as a reservoir of future inoculum, the longevity of ascospores under a variety of conditions should be investigated.

Based on the results observed from the in-lab mating experiments, which corroborate published reports that M. majus (and likely M. nivale) are homothallic, the MAT1 and flanking genes from several non-Microdochium species from the Xylariales were investigated. The goal of this experiment was to determine whether a putative MAT1-1-1 sequence could be identified in any of these species. If a Xylariales MAT1-1-1 could be identified, this sequence might facilitate the identification of MAT1-1-1 from M. nivale and / or M. majus. The synteny of the flanking genes was also investigated to determine whether the unusual layout observed in M. nivale and M. majus was specific to this genus, or whether it is conserved among the Xylariales.

For these reasons, the MAT1 locus and flanking genes in seven non-Microdochium genomes were analysed. An additional three Microdochium genomes (M. nivale isolate 10106, M. majus isolate 10095, and M. bolleyi isolate 07020) that became available at this time were also included in this analysis. The "flanking region" in this study was expanded to include two additional genes, APC5 (anaphase-promoting complex) and COX13 (cytochrome c oxidase), to facilitate the identification of any further patterns of organization among the genes in this region.

In all thirteen of the Xylariales genomes studied, the four flanking genes and at least two HMG-box-containing sequences were identified. None of the genomes contained a putative match for MAT1-1-1. In the six Microdochium spp. genomes, the synteny of the MAT1 locus and the four flanking genes was identical; however, among the other genomes, this arrangement was conserved only in Pestalotiopsis sp. (Figure 5.2). In the other Xylariales genomes, the only trend observed was that APC5 and SLA2 were always in close proximity to each other, as were APN2 and COX13. In Daldinia eschscholtzii, all four genes were in very close proximity with only 3.4 kb separating APN2 and SLA2; in Annulohypoxylon stygium and two of the three Hypoxylon sp. genomes (genomes "EC" and "CI"), this distance was greater than 10 kb, and in the third Hypoxylon sp. genome, the two groups of flanking genes were found on separate scaffolds. In these five cases, all of the putative HMG-containing sequences identified were also on separate scaffolds relative to any of the flanking genes. In P. neglecta, all four of the flanking genes and all of the HMG-box-containing sequences were found on separate contigs. In those species where the HMG-box-containing sequences were distant from the flanking genes, it was not possible to determine which, if any, of the sequences identified might be MAT1-2-1. Turgeon and Yoder (2000) recognized the difficulty in distinguishing between MAT1-1-3 and MAT1-2-1, and suggested that the presence of the other MAT1-1 genes (MAT1-1-1 and MAT1-1-2) in "true" MAT1-1 strains may be a useful diagnostic. However, likely homologs of MAT1-1-1 and MAT1-1-2 were not found in any of the Microdochium spp. genomes studied.
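The proximity comparisons described above reduce to asking whether two sequences lie on the same scaffold and, if so, how far apart they are. A minimal sketch under stated assumptions follows: the record type, function name, and coordinates are hypothetical and are not taken from the assemblies analysed here.

```python
from typing import NamedTuple, Optional

class GeneHit(NamedTuple):
    gene: str
    scaffold: str
    start: int  # 1-based start coordinate on the scaffold
    end: int    # end coordinate on the scaffold

def separation(a: GeneHit, b: GeneHit) -> Optional[int]:
    """Intergenic distance in bp, or None if the hits are on different scaffolds."""
    if a.scaffold != b.scaffold:
        return None
    first, second = sorted((a, b), key=lambda hit: hit.start)
    # Overlapping hits are reported as a separation of zero.
    return max(0, second.start - first.end)

# Hypothetical coordinates for two flanking genes and an HMG-box candidate.
apn2 = GeneHit("APN2", "scaffold_1", 1_000, 3_000)
sla2 = GeneHit("SLA2", "scaffold_1", 6_400, 9_000)
hmg = GeneHit("HMG_candidate", "scaffold_7", 500, 1_500)

print(separation(apn2, sla2))
print(separation(apn2, hmg))
```

A return value of None corresponds to the cases in which the HMG-box-containing sequences and the flanking genes were found on separate scaffolds, where synteny could not be assessed from the assembly alone.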

The lack of synteny among the other Xylariales species relative to the Microdochium spp. and Pestalotiopsis sp. MAT1 loci and flanking regions led to a detailed investigation of all of the matches identified for all of the genomes studied. The search was performed to clarify whether any of the secondary matches for the flanking genes represented truncated or otherwise non-functional duplicate copies of the genes of interest. However, no putative duplicates were detected based on a comparison of these candidate sequences to the GenBank database (BLASTx).

Although few reports on sexual reproduction among the Xylariales are available, those species that do produce perithecia in the lab are reported to be homothallic (Rogers 1979). This general observation is in line with the proposed mode of sexual reproduction in M. nivale and M. majus. The apparent lack of MAT1-1-1 in all of the Xylariales examined was thus unexpected.

One possible explanation for the apparent discrepancy is that MAT1-1-1 truly is found in the genomes of all homothallic Xylariales, but that the sequence of this gene among this order is simply dissimilar from the MAT1-1-1 sequences previously reported, making it difficult to find with the tools and information that were used in these experiments. The proposed sequence divergence may be explained by the apparently basal relationship of the Xylariales to the other members of the Sordariomycetes (Schoch et al. 2009). The proposed dissimilarity may also explain the unusual and generally inconsistent arrangement of the flanking genes and putative MAT1 loci of these species relative to that of other Sordariomycetes. If this hypothesis is true, the yet-unidentified MAT1-1-1 gene is most likely found in a location that is physically distant from any of the genes that are "traditionally" identified as syntenic to the MAT1 cassette, because the regions 10 kb immediately up- and downstream of the flanking genes were also searched, unsuccessfully, for sequences resembling MAT1-1-1. In some homothallic ascomycetes where the MAT genes are distant from one another, one gene may be surrounded by both of the flanking genes while the other is proximate to at least a partial sequence of one of the two genes (e.g. Pöggeler et al. 2011). However, neither complete nor truncated duplicate sequences for any of the four flanking genes were identified in the genomes of any of the Xylariales examined, which complicates future efforts to identify a MAT1-1-1 sequence among the Xylariales.

A more intriguing explanation is that the MAT1-1-1 sequence truly is absent in these genomes, and that mating in the Xylariales may not be under the exclusive control of the "canonical" MAT genes MAT1-1-1 and MAT1-2-1. This hypothesis is strengthened by recent reports of unconventional mating among other Ascomycota, including members of the genera Glomerella (Chen et al. 2002; Menat et al. 2012; Rodriguez-Guerra et al. 2005; Vaillancourt et al. 2000) and Neurospora (Lin and Heitman 2007). In those Neurospora species displaying apparent homothallism but possessing only a single MAT gene, only MAT1-1-1 was identified (Lin and Heitman 2007). However, in G. graminicola, strains displaying a complex mixture of self- and inter-fertility and sterility that could not be readily classified as homo- or heterothallism apparently contained only MAT1-2-1 (Vaillancourt et al. 2000). This situation is similar to that observed for the Xylariales examined in this Chapter, especially M. nivale, which apparently possessed two distinct alleles of the putative MAT1-2-1 gene (exemplified by the genes from isolates 12262 and 10106 vs. that of 11037). At this stage, a complete explanation of the mating behaviour in M. nivale and M. majus is not possible.

The results presented in this Chapter underscore the importance of corroborating genetic data with field and laboratory observations of living samples, and vice versa. Apparently homothallic mating was observed among isolates of M. majus, which produced fertile ascospores, but the few isolates of M. nivale that produced perithecia failed to produce ascospores. An examination of genomic data for three isolates of M. nivale, two isolates of M. majus, and a total of eight other members of the Xylariales failed to reveal the presence of a second MAT gene, suggesting that either the nucleotide sequence of the MAT1-1-1 gene among the Xylariales is so different from that of other Sordariomycetes that it is unrecognizable, or that this gene may be missing altogether from this order, and that the control of mating in this group may be unlike that in other taxa. Further research is necessary to explore these possible explanations.


5.5 References for Chapter 5

Amselem, J., Cuomo, C.A., van Kan, J.A.L., Viaud, M., Benito, E.P., Couloux, A., Coutinho, P.M., de Vries, R.P., Dyer, P.S., Fillinger, S., Fournier, E., Gout, L., Hahn, M., Kohn, L., Lapalu, N., Plummer, K.M., Pradier, J.-M., Quévillon, E., Sharon, A., Simon, A., ten Have, A., Tudzynski, B., Tudzynski, P., Wincker, P., Andrew, M., Anthouard, V., Beever, R.E., Beffa, R., Benoit, I., Bouzid, O., Brault, B., Chen, Z., Choquer, M., Collémare, J., Cotton, P., Danchin, E.G., Da Silva, C., Gautier, A., Giraud, C., Giraud, T., Gonzalez, C., Grossetete, S., Güldener, U., Henrissat, B., Howlett, B.J., Kodira, C., Kretschmer, M., Lappartient, A., Leroch, M., Levis, C., Mauceli, E., Neuvéglise, C., Oeser, B., Pearson, M., Poulain, J., Poussereau, N., Quesneville, H., Rascle, C., Schumacher, J., Ségurens, B., Sexton, A., Silva, E., Sirven, C., Soanes, D.M., Talbot, N.J., Templeton, M., Yandava, C., Yarden, O., Zeng, Q., Rollins, J.A., Lebrun, M.-H., and Dickman, M. 2011. Genomic Analysis of the Necrotrophic Fungal Pathogens Sclerotinia sclerotiorum and Botrytis cinerea. PLoS Genetics 7(8): e1002230.
Arie, T., Christiansen, S.K., Yoder, O.C., and Turgeon, B.G. 1997. Efficient cloning of ascomycete mating type genes by PCR amplification of the conserved MAT HMG box. Fungal Genetics and Biology 21: 118-130.
Arie, T., Kaneko, I., Yoshida, T., Noguchi, M., Nomura, Y., and Yamaguchi, I. 2000. Mating-Type Genes from Asexual Phytopathogenic Ascomycetes Fusarium oxysporum and Alternaria alternata. Molecular Plant-Microbe Interactions 13: 1330-1339.
Arie, T., Yoshida, T., Shimizu, T., Kawabe, M., Yoneyama, K., and Yamaguchi, I. 1999. Assessment of Gibberella fujikuroi mating type by PCR. Mycoscience 40: 311-314.
Bennett, F.T. 1933. Fusarium species on British Cereals. Annals of Applied Biology 20(2): 272-290.
Bianchi, M.E., and Agresti, A. 2005. HMG proteins: dynamic players in gene regulation and differentiation. Current Opinion in Genetics and Development 15(5): 496-506.
Booth, C. 1971. Micronectriella nivalis. In CMI Descriptions of Pathogenic Fungi and Bacteria No. 309. Commonwealth Mycological Institute, Kew.
Butler, G. 2007. The Evolution of MAT: The Ascomycetes. In Sex in Fungi: Molecular Determination and Evolutionary Implications. Edited by J. Heitman, J.W. Kronstad, J.W. Taylor, and L.A. Casselton. ASM Press, Washington, D.C.
Butler, G., Kenny, C., Fagan, A., Kurischko, C., Gaillardin, C., and Wolfe, K.H. 2004. Evolution of the MAT locus and its Ho endonuclease in yeast species. Proceedings of the National Academy of Sciences of the U.S.A. 101: 1632-1637.
Casselton, L.A. 2008. Fungal sex genes - searching for the ancestors. BioEssays 30(8): 711-714.
Chen, F., Goodwin, P.H., Khan, A., and Hsiang, T. 2002. Population structure and mating-type genes of Colletotrichum graminicola from Agrostis palustris. Canadian Journal of Microbiology 48: 427-436.
Cook, R.J., and Bruehl, G.W. 1966. Calonectria nivalis, perfect stage of Fusarium nivale, occurs in the field in North America. Phytopathological Notes 54: 1100-1101.
Coppin, E., Debuchy, R., Arnaise, S., and Picard, M. 1997. Mating types and sexual development in filamentous ascomycetes. Microbiology and Molecular Biology Reviews 61: 411-428.


Dubin, R.A., and Ostrer, H. 1994. Sry is a transcriptional activator. Molecular Endocrinology 8(9): 1182-1192.
Ferreira, A.V., An, Z., Metzenberg, R.L., and Glass, N.L. 1998. Characterization of mat A-2, mat A-3 and deltamatA mating-type mutants of Neurospora crassa. Genetics 148: 1069-1079.
Glynn, N.C., Hare, M.C., Parry, D.W., and Edwards, S.G. 2005. Phylogenetic analysis of EF-1 alpha gene sequences from isolates of Microdochium nivale leads to elevation of varieties majus and nivale to species status. Mycological Research 109: 872-880.
Gordon, W.L. 1952. The occurrence of Fusarium species in Canada: II. Prevalence and taxonomy of Fusarium species in cereal seed. Canadian Journal of Botany 30(2): 209-251.
Günther, S., Piwon, N., and Will, H. 1998. Wild-type levels of pregenomic RNA and replication but reduced pre-C RNA and e-antigen synthesis of hepatitis B virus with C(1653) --> T, A(1762) --> T and G(1764) --> A mutations in the core promoter. Journal of General Virology 79: 375-380.
Hibbett, D.S., Binder, M., Bischoff, J.F., Blackwell, M., Cannon, P.F., Eriksson, O.E., Huhndorf, S., James, T., Kirk, P.M., Lücking, R., Lumbsch, H.T., Lutzoni, F., Matheny, P.B., McLaughlin, D.J., Powell, M.J., Redhead, S., Schoch, C.L., Spatafora, J.W., Stalpers, J.A., Vilgalys, R., Aime, M.C., Aptroot, A., Bauer, R., Begerow, D., Benny, G.L., Castlebury, L.A., Crous, P.W., Dai, Y.-C., Gams, W., Geiser, D.M., Griffith, G.W., Gueidan, C., Hawksworth, D.L., Hestmark, G., Hosaka, K., Humber, R.A., Hyde, K.D., Ironside, J.E., Kõljalg, U., Kurtzman, C.P., Larsson, K.-H., Lichtwardt, R., Longcore, J., Miądlikowska, J., Miller, A., Moncalvo, J.-M., Mozley-Standridge, S., Oberwinkler, F., Parmasto, E., Reeb, V., Rogers, J.D., Roux, C., Ryvarden, L., Sampaio, J.P., Schüßler, A., Sugiyama, J., Thorn, R.G., Tibell, L., Untereiner, W.A., Walker, C., Wang, Z., Weir, A., Weiss, M., White, M.M., Winka, K., Yao, Y.-J., and Zhang, N. 2007. A higher-level phylogenetic classification of the Fungi. Mycological Research 111(5): 509-547.
Klix, V., Nowrousian, M., Ringelberg, C., Loros, J.J., Dunlap, J.C., and Pöggeler, S. 2010. Functional characterization of MAT1-1-specific mating-type genes in the homothallic ascomycete Sordaria macrospora provides new insights into essential and nonessential sexual regulators. Eukaryotic Cell 9(6): 894-905.
Lees, A.K., Nicholson, P., Rezanoor, H.N., and Parry, D.W. 1995. Analysis of variation within Microdochium nivale from wheat - evidence for a distinct subgroup. Mycological Research 99: 103-109.
Lin, X., and Heitman, J. 2007. Mechanisms of homothallism in fungi and transitions between heterothallism and homothallism. In Sex in Fungi: Molecular Determination and Evolutionary Implications. Edited by J. Heitman, J.W. Kronstad, J.W. Taylor, and L.A. Casselton. ASM Press, Washington, D.C.
Linde, C.C., Zala, M., Ceccarelli, S., and McDonald, B.A. 2003. Further evidence for sexual reproduction in Rhynchosporium secalis based on distribution and frequency of mating-type alleles. Fungal Genetics and Biology 40: 115-125.
Litschko, L., and Burpee, L.L. 1987. Variation among isolates of Microdochium nivale collected from wheat and turfgrasses. Transactions of the British Mycological Society 89: 252-256.
Mahuku, G.S., Hsiang, T., and Yang, L. 1998. Genetic diversity of Microdochium nivale isolates from turfgrass. Mycological Research 102: 559-567.

Menat, J., Cabral, A.L., Vijayan, P., Wei, Y., and Banniza, S. 2012. Glomerella truncata: another Glomerella species with an atypical mating system. Mycologia 104(3): 641-649.
Merino, S.T., Nelson, M.A., Jacobson, D.J., and Natvig, D.O. 1996. Pseudohomothallism and evolution of the mating-type chromosome in Neurospora tetrasperma. Genetics 143: 789-799.
Metzenberg, R.L., and Glass, N.L. 1990. Mating type and mating strategies in Neurospora. BioEssays 12: 53-59.
Parry, D.W., Rezanoor, H.N., Pettitt, T.R., Hare, M.C., and Nicholson, P. 1995. Analysis of Microdochium nivale isolates from wheat in the UK during 1993. Annals of Applied Biology 126(3): 449-455.
Pöggeler, S. 2000. Two pheromone precursor genes are transcriptionally expressed in the homothallic ascomycete Sordaria macrospora. Current Genetics 37: 403-411.
Pöggeler, S., and Kuck, U. 2000. Comparative analysis of the mating-type loci from Neurospora crassa and Sordaria macrospora: identification of novel transcribed ORFs. Molecular and General Genetics 263: 292-301.
Pöggeler, S., O'Gorman, C.M., Hoff, B., and Kuck, U. 2011. Molecular organization of the mating-type loci in the homothallic Ascomycete Eupenicillium crustaceum. Fungal Biology 115: 615-624.
Putman, A.I., Carbone, I., and Tredway, L.P. 2011. Characterization and distribution of mating type genes in Sclerotinia homoeocarpa populations. In American Phytopathological Society Annual Meeting. APS, Honolulu, Hawaii, USA.
Rodriguez-Guerra, R., Ramirez-Rueda, M.-T., Cabral-Enciso, M., Garcia-Serrano, M., Lira-Maldonado, Z., Gevara-Gonzalez, R.G., Gonzalez-Chavira, M., and Simpson, J. 2005. Heterothallic mating observed between Mexican isolates of Glomerella lindemuthiana. Mycologia 97(4): 793-803.
Rogers, J.D. 1979. The Xylariaceae: Systematic, biological, and evolutionary aspects. Mycologia 71(1): 1-42.
Rydholm, C., Dyer, P.S., and Lutzoni, F. 2007. DNA sequence characterization and molecular evolution of MAT1 and MAT2 mating-type loci of the self-compatible ascomycete mold Neosartorya fischeri. Eukaryotic Cell 6(5): 868-874.
Schaffnit, E. 1913. Zur systematik von Fusarium nivale bzw. seiner hoheren fruchtform. Mycologisches Centralblatt 2: 253-258.
Schoch, C.L., Sung, G.H., Lopez-Giraldez, F., Townsend, J.P., Miadlikowska, J., Hofstetter, V., Robbertse, B., Matheny, P.B., Kauff, F., Wang, Z., Gueidan, C., Andrie, R.M., Trippe, K., Ciufetti, L.M., Wynns, A., Fraker, E., Hodkinson, B.P., Bonito, G., Groenewald, J.Z., Arzanlou, M., de Hoog, G.S., Crous, P.W., Hewitt, D., Pfister, D.H., Peterson, K., Gryzenhout, M., Wingfield, M.J., Aptroot, A., Suh, S.O., Blackwell, M., Hillis, D.M., Griffith, G.W., Castlebury, L.A., Rossman, A.Y., Lumbsch, H.T., Lucking, R., Budel, B., Rauhut, A., Diederich, P., Ertz, D., Geiser, D.M., Hosaka, K., Inderbitzin, P., Kohlmeyer, J., Volkmann-Kohlmeyer, B., Mostert, L., O'Donnell, K., Sipman, H., Rogers, J.D., Shoemaker, R.A., Sugiyama, J., Summerbell, R.C., Untereiner, W., Johnston, P.R., Stenroos, S., Zuccaro, A., Dyer, P.S., Crittenden, P.D., Cole, M.S., Hansen, K., Trappe, J.M., Yahr, R., Lutzoni, F., and Spatafora, J.W. 2009. The Ascomycota Tree of Life: A Phylum-wide Phylogeny Clarifies the Origin and Evolution of Fundamental Reproductive and Ecological Traits. Systematic Biology 58(2): 224-239.

Smiley, R.W., Dernoeden, P.H., and Clarke, B.B. 2005. Compendium of Turfgrass Diseases, 3rd Ed. American Phytopathological Society, St. Paul, MN.
Smith, J.D. 1983. Fusarium nivale (Gerlachia nivalis) from cereals and grasses - is it the same fungus? Canadian Plant Disease Survey 63(1): 25-26.
Staben, C., and Yanofsky, C. 1990. Neurospora crassa a mating-type region. Proceedings of the National Academy of Sciences of the U.S.A. 87: 4917-4921.
Stevens. 1918. Some Meliolicolous parasites and commensals from Porto Rico. In The Botanical Gazette. Edited by J.M. Coulter. The University of Chicago Press, Chicago, Illinois.
Taylor, J.W., Jacobson, D.J., and Fisher, M.C. 1999. The evolution of asexual fungi: reproduction, speciation, and classification. Annual Review of Phytopathology 37: 197-246.
Tronsmo, A.M., Hsiang, T., Okuyama, H., and Nakajima, T. 2001. Low temperature diseases caused by Microdochium nivale. In Low temperature plant microbe interactions under snow. Edited by D.A. Gaudet, A.M. Tronsmo, N. Matsumoto, M. Yoshida, and A. Nishimune. Hokkaido National Agricultural Experiment Station, Japan.
Turgeon, B.G., and Yoder, O.C. 2000. Proposed nomenclature for mating type genes of filamentous ascomycetes. Fungal Genetics and Biology 31: 1-5.
Vaillancourt, L., Du, M., Wang, J., Rollins, J., and Hanau, R. 2000. Genetic analysis of cross fertility between two self-sterile strains of Glomerella graminicola. Mycologia 92(3): 430-435.
Webster, J., and Weber, R.W.S. 2007. Introduction to Fungi, 3rd edition. Cambridge University Press, New York.
Wheeler, H.E. 1954. Genetics and evolution of heterothallism in Glomerella. Phytopathology 44: 342-345.
Wilken, P.M., Steenkamp, E.T., Hall, T.A., de Beer, Z.W., Wingfield, M.J., and Wingfield, B.D. 2012. Both mating types in the heterothallic fungus Ophiostoma quercus contain MAT1-1 and MAT1-2 genes. Fungal Biology 116(3): 427-437.
Yun, S.-H., Arie, T., Kaneko, I., Yoder, O.C., and Turgeon, B.G. 2000. Molecular organization of mating type loci in heterothallic, homothallic, and asexual Gibberella/Fusarium species. Fungal Genetics and Biology 31: 7-20.

Table 5.1 Published mating-type primers tested with M. nivale and M. majus

Gene      Primer Name  Primer Sequence          Source
MAT1-1-1  Falpha1      CGGTCAYGAGTATCTTCCTG     (Arie et al. 2000)
MAT1-1-1  Falpha2      GATGTAGATGGAGGGTTCAA     (Arie et al. 2000)
MAT1-2-1  ChHMG1       AAGGCNCCNCGYCCNATGAAC    (Arie et al. 1997)
MAT1-2-1  ChHMG2       CTNGGNGTGTAYTTGTAATTNGG  (Arie et al. 1997)
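Several of the primers in Table 5.1 contain IUPAC ambiguity codes (Y, N), so each one stands for a pool of concrete oligonucleotides. The following is a minimal sketch of how that degeneracy could be counted and expanded; the function names `degeneracy` and `expand` are illustrative and are not taken from the thesis.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count

def expand(primer):
    """List every concrete sequence that a degenerate primer stands for."""
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in primer))]

# Primer sequences as listed in Table 5.1
primers = {
    "Falpha1": "CGGTCAYGAGTATCTTCCTG",
    "Falpha2": "GATGTAGATGGAGGGTTCAA",
    "ChHMG1": "AAGGCNCCNCGYCCNATGAAC",
    "ChHMG2": "CTNGGNGTGTAYTTGTAATTNGG",
}

for name, seq in primers.items():
    print(name, len(seq), "nt, degeneracy", degeneracy(seq))
```

Under these assumptions the two ChHMG primers each represent 128 sequences, whereas the Falpha primers are nearly or fully specific.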

Table 5.2 Isolates of M. nivale and M. majus tested with published mating-type primers

Species    Isolate number  Geographic origin  Host plant origin  Primer(s) Tested*
M. majus   99061           NA                 W                  ChHMG, Fα
M. majus   99049           NA                 W                  Fα
M. nivale  10082           UK                 T                  ChHMG, Fα
M. nivale  10083           UK                 T                  ChHMG, Fα
M. nivale  99046           NA                 W                  ChHMG, Fα
M. nivale  99006           NA                 W                  ChHMG, Fα
M. nivale  99069           NA                 W                  ChHMG, Fα
M. nivale  99007           NA                 W                  ChHMG, Fα
M. nivale  10086           NA                 T                  ChHMG, Fα
M. nivale  10084           UK                 T                  ChHMG
M. nivale  10086           NA                 T                  ChHMG
* The literature references for these primers are found in Table 5.1.

Table 5.3 List of species and GenBank accession numbers for genes used to design conserved MAT1-1-1 and MAT1-2-1 primers in Microdochium sp.

Gene of interest  Genus and Species                       GenBank Accession Number
MAT1-1-1          Alternaria tenuissima                   AY004675.1
MAT1-1-1          Pleospora triglochinicola               AY335167.1
MAT1-1-1          Pleospora eturmiuna                     AY335176.1
MAT1-1-1          Stemphylium majusculum                  AY335174.1
MAT1-1-1          Pleospora tarda                         AY335164.1
MAT1-1-1          Stemphylium callistephi                 AY339863.1
MAT1-1-1          Pleospora paludiscirpi                  AY335177.1
MAT1-1-1          Stemphylium solani                      AY339855.1
MAT1-1-1          Lewia infectoria                        AB444188.1
MAT1-1-1          Penicillium decumbens                   HM067979.1
MAT1-1-1          Mycosphaerella musicola                 GU057991.1
MAT1-1-1          Setosphaeria turcica                    GU997138.1
MAT1-1-1          Ophiostoma novo-ulmi var. novo-ulmi     FJ858801.1
MAT1-1-1          Diaporthe melonis                       GQ250237.1
MAT1-1-1          Paracoccidioides brasiliensis           GQ411379.1
MAT1-1-1          Aspergillus parasiticus                 EU357935.1
MAT1-1-1          Rhynchosporium secalis                  FJ382949.1
MAT1-1-1          Mycosphaerella pini                     DQ915450.1
MAT1-1-1          Dothistroma pini                        DQ915449.1
MAT1-1-1          Cercospora zeina                        DQ264748.1
MAT1-1-1          Cercospora apii                         DQ264736.1
MAT1-1-1          Fusarium oxysporum                      AY527423.1
MAT1-1-1          Pleospora sp.                           AY339862.1
MAT1-1-1          Fusarium brasilicum                     AY452907.1
MAT1-1-1          Alternaria alternata                    AB468151.1
MAT1-1-1          Ophiostoma novo-ulmi                    EU163846.1
MAT1-1-1          Verticillium dahliae                    AB469828.1
MAT1-1-1          Phaeosphaeria avenaria f.sp. triticae   AY196993.1
MAT1-1-1          Pyrenophora tritici-repentis            AM884611.1
MAT1-1-1          Neotyphodium uncinatum                  AB258373.1
MAT1-1-1          Fusarium oxysporum f.sp. lycopersici    AB011379.2
MAT1-1-1          Fusarium oxysporum                      AY527415.1
MAT1-2-1          Xanthodactylon flammeum                 CAI59780

Gene of interest  Genus and Species               GenBank Accession Number
MAT1-2-1          Ajellomyces capsulatus          ABO87595
MAT1-2-1          Petromyces alliaceus            ACE74241
MAT1-2-1          Aspergillus flavus              ACA51904
MAT1-2-1          Neosartorya fischeri            XP_001263957
MAT1-2-1          Paracoccidioides brasiliensis   EEH50039
MAT1-2-1          Coccidioides immitis            XP_001246635
MAT1-2-1          Paracoccidioides brasiliensis   ACV32366
MAT1-2-1          Microsporum gypseum             ACS91132
MAT1-2-1          Dothistroma pini                ABK91353
MAT1-2-1          Magnaporthe grisea              BAC65090
MAT1-2-1          Cercospora zeae-maydis          ABB83719
MAT1-2-1          Mycosphaerella eumusae          ADB11112
MAT1-2-1          Glomerella lindemuthiana        ABY84976
MAT1-2-1          Ophiostoma novo-ulmi            AAX83067
MAT1-2-1          Sordaria macrospora             CAA71624
MAT1-2-1          Neurospora cerealis             AAL28011

Table 5.4 Primers designed to amplify mating-type (MAT1-1 and MAT1-2) and flanking genes in Microdochium spp.

Target Sequence Primer Name Primer Sequence Gene Reference MAT1_3149F TCATRGCYTTYCGMWGTAAG MAT1_3644R GTGGTCATGATGCCNTTC MAT1_3132F AACGSCTACATGGCCTT MAT1-1 GenBank MAT1_3303R GATTGCYAARGTCTAYTCYTT MAT1_86F CGGCTTTATGGCGTTTCG MAT1_891R AAGCTATACACTGCGGCAAT MAT2_3515F AGTTCTGATGTGGACTTCTCA MAT2_4290R CAAGGGMAARCRCMCTSGT MAT2_1404F CCTYCATYCTSTACCGCAA GenBank MAT2_1811R TACASCGMCTTGYCGCA MAT2_488F AATGCCTACATTCTCTACCGC MAT2_650R ACGGACTGTGTCTTCCTCGG MAT1-2 Mn_MAT2_3347F TCGCCGCTATCCCCACT Mm99049 Mn_MAT2_3871R TGGAGTCGCTGCCCATG Mn_MAT2_20F AGCATGGCACAATGAGCC Mn11037 Mn_MAT2_727R CTGTCGTGCTGTTGTCG Mn11037, Mic_MAT2_198F CGAAGCGYGAGRCGAAG Mn12262, Mic_MAT2_676R AAGCTCTGRTCTTGAGTGT Mm99049 Mm99049 Mn_lyase_838F CGCCCAAGAACCTTCTCC Mn11037 Mn_APN2_700R GCCCAAGAACCTTCTCC APN2 Mn11037, Mn12262, Mic_APN2_32R GAACCTTCTCCCACTATCAGC Mm99049 Mn_SLA2_23R TTGTTGCCGACGTTGCGG Mm99049 Mn_SLA2_1156F CCTCCGAACCAAGTCGC Mn11037 Mn_SLA2_357F CCGCAATGTCGGCAACA SLA2 Mn11037, Mn12262, Mic_SLA2_92F GCATGCTGCGTGCCATGCACTCC Mm99049

Table 5.5 Number of bands amplified by the listed ISSR and SSR PCR primers from a selection of Microdochium majus isolates, including the single-ascospore (AS) cultures derived from the parent isolate 99049.

                                Primer tested
Isolate                         BHY(AGC)5  DD(CCA)5  (CAC)5  (CT)5  (GC)5  (GAT)6  916061  924078
99049 - AS1*                    8          7         0       2      4      2       5       5
99049 - AS2                     8          7         1       2      0      2       5       5
99049 - AS3                     8          7         1       2      4      2       5       5
99049 - AS4                     8          7         1       2      4      2       5       5
99049 - AS5                     8          7         1       2      4      2       5       5
99049 - AS6                     8          7         1       2      4      2       5       5
99049 - AS7                     8          7         1       2      3      2       5       5
99049 - AS8                     8          7         1       2      3      2       5       5
99049 - AS9                     8          7         0       2      3      2       5       5
99049 - AS10                    8          7         1       2      3      2       5       5
99049 - parent                  8          7         1       2      3      2       5       5
12043                           8          7         0       0      2      2       5       5
12044                           8          7         0       0      2      3       3       5
12045                           8          7         1       1      1      2       5       5
12046                           8          7         1       1      1      1       5       5
10098                           8          7         1       0      1      0       5       5
10099                           8          6         3       1      0      1       5       5
99061                           4          6         0       0      0      0       5       5
Total number of possible bands  8          7         4       2      5      4       5       5
* AS = single-ascospore strain of isolate 99049

Table 5.6 List of species and GenBank accession numbers for genes used to search for putative MAT1-1-1, MAT1-2-1, and flanking genes in Microdochium genomes by standalone tBLASTn

Genus and Species          Gene Name              GenBank Accession Number
Alternaria tenuissima      MAT1-1-1               AY004675.1
Pleospora triglochinicola  MAT1-1-1 and MAT1-2-1  AY335167.1
Pleospora eturmiuna        MAT1-1-1 and MAT1-2-1  AY335176.1
Stemphylium majusculum     MAT1-1-1 and MAT1-2-1  AY335174.1
Microsporum gypseum        SLA2                   ACS91152.1
Microsporum gypseum        APN2                   ACS91139.1
Grossmania clavigera       COX13                  EFX05020.1
Sordaria macrospora        APC5                   XM_003347060.1

Table 5.7 List of predicted genes corresponding to the mating-type gene (MAT1-2-1) and flanking genes (cytoskeletal assembly protein SLA2 and DNA lyase APN2) in the Microdochium genomes studied. Comparisons were performed using standalone tBLASTn to query the gene of interest against the Microdochium spp. genomes listed.

Gene of   Genome   Putative  Sequence     Similarity to sequence used to identify putative homolog†
Interest           Homolog*  Length (bp)  Alignment length (nt)  Identity (%)  e-value
SLA2      Mm99049  g9065     3,144        3207                   68.8          0.0
SLA2      Mn11037  g10271    3,156        3219                   68.1          0.0
SLA2      Mn12262  g814      3,144        3207                   68.8          0.0
SLA2      Mm10095  g10807    3,144        3207                   68.8          0.0
SLA2      Mn10106  g6682     3,144        3207                   68.8          0.0
SLA2      Mb07020  g6892     3,450        3207                   69.0          0.0
MAT1-2-1  Mm99049  g9066     1,860        210                    34.3          2e-12
MAT1-2-1  Mn11037  g10270    1,671        720                    23.3          3e-11
MAT1-2-1  Mn12262  g815      1,941        177                    77            2e-06
MAT1-2-1  Mm10095  g10806    1,722        141                    34.0          0.001
MAT1-2-1  Mn10106  g6681     1,890        210                    32.9          2e-08
MAT1-2-1  Mb07020  g6891     1,815        222                    37.8          3e-11
APN2      Mm99049  g9067     1,824        1791                   46.9          2e-142
APN2      Mn11037  g10269    1,962        1917                   46.3          7e-147
APN2      Mn12262  g816      1,866        1764                   47.4          7e-142
APN2      Mm10095  g10805    1,839        1818                   48.7          1e-150
APN2      Mn10106  g6680     2,253        1764                   48.8          3e-150
APN2      Mb07020  g6890     1,833        1785                   49.1          2e-156
* Based on coding sequence of predicted gene
† Based on tBLASTn search against a database consisting of the genome of interest
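The predicted-gene identifiers in Table 5.7 are numbered consecutively within each genome (for example g9065, g9066, g9067 in Mm99049). The sketch below checks that pattern for all six genomes. It rests on an assumption that is not stated in the table: that the gene predictor numbers genes in their order along each scaffold, so that consecutive numbers imply physical adjacency. The name `mat_is_flanked` is illustrative only.

```python
# Putative homolog identifiers as listed in Table 5.7
homologs = {
    "Mm99049": {"SLA2": "g9065", "MAT1-2-1": "g9066", "APN2": "g9067"},
    "Mn11037": {"SLA2": "g10271", "MAT1-2-1": "g10270", "APN2": "g10269"},
    "Mn12262": {"SLA2": "g814", "MAT1-2-1": "g815", "APN2": "g816"},
    "Mm10095": {"SLA2": "g10807", "MAT1-2-1": "g10806", "APN2": "g10805"},
    "Mn10106": {"SLA2": "g6682", "MAT1-2-1": "g6681", "APN2": "g6680"},
    "Mb07020": {"SLA2": "g6892", "MAT1-2-1": "g6891", "APN2": "g6890"},
}

def gene_number(gene_id):
    """Strip the leading 'g' and return the numeric part of a gene identifier."""
    return int(gene_id.lstrip("g"))

def mat_is_flanked(genes):
    """True if SLA2 and APN2 carry the numbers directly on either side of MAT1-2-1.

    Assumption: consecutive gene numbers reflect adjacent genes on a scaffold.
    """
    sla2 = gene_number(genes["SLA2"])
    mat = gene_number(genes["MAT1-2-1"])
    apn2 = gene_number(genes["APN2"])
    return {sla2, apn2} == {mat - 1, mat + 1}

for genome, genes in homologs.items():
    print(genome, mat_is_flanked(genes))
```

If the numbering assumption holds, every genome in the table places the putative MAT1-2-1 homolog directly between SLA2 and APN2.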

Table 5.8 Results of MAT1-2-1 amplification of M. nivale isolates with the primers Mn_MAT2_20F and Mn_MAT2_727R

Host Plant                     Collection location*       Number of isolates tested  Number of isolates amplified by MAT1-2-1 primers
P. pratensis                   GTI (beside native green)  25                         6
P. pratensis                   GTI (beside road)          25                         21
P. annua / A. stolonifera mix  GTI (native green)         9                          6
L. perenne                     GTI (roadway)              10                         8
Triticum sp.                   Ottawa Experimental Farm   23                         13
TOTAL                                                     92                         54
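The counts in Table 5.8 can be expressed as amplification rates per collection. The following short sketch does that arithmetic and re-derives the totals; it assumes that the larger number in each row is the number of isolates tested and the smaller the number amplified, which is consistent with the column totals of 92 and 54.

```python
# (host plant, collection location, isolates tested, isolates amplified), from Table 5.8
rows = [
    ("P. pratensis", "GTI (beside native green)", 25, 6),
    ("P. pratensis", "GTI (beside road)", 25, 21),
    ("P. annua / A. stolonifera mix", "GTI (native green)", 9, 6),
    ("L. perenne", "GTI (roadway)", 10, 8),
    ("Triticum sp.", "Ottawa Experimental Farm", 23, 13),
]

def percent_amplified(tested, amplified):
    """Percentage of tested isolates that yielded a MAT1-2-1 amplicon."""
    return 100.0 * amplified / tested

total_tested = sum(tested for _, _, tested, _ in rows)
total_amplified = sum(amplified for _, _, _, amplified in rows)

for host, location, tested, amplified in rows:
    print(f"{host}, {location}: {percent_amplified(tested, amplified):.1f}%")
print(f"TOTAL: {total_amplified}/{total_tested} = "
      f"{percent_amplified(total_tested, total_amplified):.1f}%")
```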

Table 5.9 Isolates used in mating-type crosses

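Experiment  Species    Host Plant  Isolate Number
1           M. majus   wheat       99049
1           M. majus   wheat       12045
1           M. majus   wheat       12166
1           M. nivale  wheat       12160
1           M. nivale  turfgrass   11037
1           M. nivale  turfgrass   12049
2           M. majus   wheat       10095
2           M. majus   wheat       12045
2           M. majus   wheat       12166
2           M. majus   wheat       99049
2           M. nivale  turfgrass   11029
2           M. nivale  turfgrass   11037
2           M. nivale  turfgrass   12085
2           M. nivale  wheat       12262
2           M. nivale  wheat       99077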

Table 5.10 Summary of perithecial production in second experiment according to temperature of incubation and species included in each cross. Isolates were inoculated on sterilized wheat straw overlaid on water agar, and observations were performed after two months of incubation.

                         Percentage of plates with perithecia  Percentage of perithecia
                         at each incubation temperature        with ascospores
Cross  Number of Plates  5 °C   10 °C  15 °C  20 °C            5 °C   10 °C  15 °C  20 °C
M x M  10                0%     30%    60%    80%              -      33%    50%    75%
M x N  20                0%     0%     40%    60%              -      -      75%    50%
N x N  15                0%     20%    0%     13%              -      0%     -      0%

Table 5.11 Species used for MAT-region synteny investigation. All species were members of the order Xylariales and family Xylariaceae.

Genus and Species        Genome Source
Daldinia eschscholtzii   Joint Genome Institute
Annulohypoxylon stygium  courtesy of B. Xie, Fujian Agriculture and Forestry University, China
Hypoxylon sp. (CO)       Joint Genome Institute
Hypoxylon sp. (EC)       Joint Genome Institute
Pestalotiopsis neglecta  courtesy of K. Watanabe, Tamagawa University, Japan
Pestalotiopsis sp.       courtesy of K. Watanabe, Tamagawa University, Japan

Table 5.12 BLASTx results for putative matches to flanking and mating genes observed in Xylariales genomes. BLAST searches were performed by querying the putative flanking genes (cytoskeletal assembly protein SLA2, DNA lyase APN2, anaphase-promoting complex protein APC5, and cytochrome oxidase COX) and mating type gene (MAT2) against the GenBank non-redundant database.

Query Query Species location in Top Hit Accession %ID† %Query‡ E-value Identity genome putative cytoskeleton assembly control scaffold_13 EMR72087.1 88% 98% 0 SLA2 protein sla2 protein scaffold_19 arsenite resistance EMR69908.1 97% 98% 2.00E-19 scaffold_13 Hypothetical protein (probable APN2) XP_003660213.1 78% 92% 0 APN2 Hypothetical protein (probable scaffold_55 XP_001907561.1 73% 99% 9.00E-33 cytosolic ARG-trna ligase) scaffold_38 HMG box transcriptional ELA37846.1 80% 98% 2.00E-29 Daldinia MAT2 eschscholtzii scaffold_58 HMG box protein XP_963833.2 68% 96% 1.00E-26 scaffold_13 anaphase promoting protein EMR72084.1 78% 99% 0 APC5 scaffold_42 amidophosphoribosyltransferase XP_002488025.1 64% 75% 8.00E-36 Hypothetical protein scaffold_13 (probablecytochrome c-oxidase-like XP_003651352.1 60% 99% 3.00E-30 COX protein) scaffold_27 SMAC protein XP_003348334.1 100% 98% 8.00E-33 scaffold_3 no matches found putative cytoskeleton assembly control scaffold285 EMR72087.1 94% 91% 0 protein sla2 protein SLA2 Hypothetical protein (probable Annulohypoxylon scaffold96 EJT71176.1 31% 29% 3.00E-06 stygium HET_domain containing protein) scaffold285 Hypothetical protein (probable APN2) XP_003660213.1 78% 94% 0 APN2 scaffold149 Hypothetical protein (probable wsc EJT81425.1 73% 49% 7.00E-12

Query Query Species location in Top Hit Accession %ID† %Query‡ E-value Identity genome domain-containing protein) scaffold71 nitrate reductse ADY76216.1 84% 99% 8.00E-90 scaffold76 nuclear export protein EMR70615.1 78% 98% 2.00E-19 scaffold28 Putative HMG box protein EMR65965.1 82% 99% 3.00E-31 MAT2 Hypothetical protein (probable HMG- scaffold60 XP_003856925.1 59% 99% 2.00E-28 box containing protein) putative anaphase-promoting complex APC5 scaffold285 EMR72084.1 78% 99% 0 protein putative cytochrome c oxidase subunit COX scaffold285 EMR72081.1 66% 99% 1.00E-34 protein scaffold_187 predicted protein (probable SLA2) XP_003053294.1 100% 95% 0 SLA2 scaffold_296 no matches scaffold_17 Hypothetical protein (probable SLA2) XP_003660213.1 83% 97% 0 APN2 scaffold_184 amine oxidase EFQ31663.1 63% 89% 8.00E-13 Hypothetical protein (probable HMG Hypoxylon sp. scaffold_107 ELR08424.1 64% 95% 1.00E-24 (CO) MAT2 box-containing protein) scaffold_85 HMG box transcriptional ELA37846.1 83% 98% 4.00E-31 putative anaphase-promoting complex APC5 scaffold_187 EMR72084.1 79% 97% 0 protein Hypothetical protein (probable COX scaffold_17 XP_003651352.1 54% 99% 1.00E-31 cytochrome c oxidase subunit protein) scaffold_80 predicted protein (probable SLA2) XP_003053294.1 100% 95% 0 SLA2 Hypothetical protein (unknown scaffold_118 CCF45826.1 76% 98% 3.00E-10 Hypoxylon sp. function) (EC) APN2 scaffold_80 Hypothetical protein (probable APN2) XP_003660213.1 75% 98% 0 Hypothetical protein (probable HMG- MAT2 scaffold_128 EGS23343.1 60% 96% 4.00E-30 box containing protein) 329

Query Query Species location in Top Hit Accession %ID† %Query‡ E-value Identity genome scaffold_18 HMG box transcriptional ELA37846.1 83% 98% 4.00E-31 APC5 scaffold_80 putative anaphase-promoting complex EMR72084.1 79% 98% 0 COX scaffold_80 cytochrome c oxidase-like protein EGS22713.1 88% 52% 1.00E-29 putative cytoskeleton assembly control scaffold_26 EMR72087.1 84% 98% 0 protein sla2 SLA2 putative tropomysin-1 alpha chain Hypoxylon sp. scaffold_47 EMR63401.1 64% 99% 5.00E-23 (CI) protein scaffold_19 fatty acid synthetase alpha subunit XP_001836417.1 48% 98% 1.00E-06 scaffold_26 hypothetical protein (probable APN2) XP_001903014.1 69% 97% 0 APN2 scaffold_19 no matches found scaffold_25 putative HMG box protein EMR65965.1 85% 99% 1.00E-31 scaffold_40 HMG box protein EFX01060.1 56% 99% 6.00E-23 MAT2 scaffold_13 predicted insulin-like growth factor 1 XP_004069709.1 42% 80% 0.49 scaffold_3 hypothetical protein EMR61840.1 68% 84% 3.00E-15 putative anaphase-promoting complex APC5 scaffold_26 EMR72084.1 76% 98% 0 protein putative cytochrome c oxidase subunit scaffold_26 EMR72081.1 89% 59% 6.00E-30 protein scaffold_77 no matches COX13 scaffold_10 no matches scaffold_4 no matches scaffold_48 putative protophyrinogen oxidase EMR69994.1 68% 93% 0.007 putative cytoskeletal assembly control pn1057 EMR72087.1 87% 95% 0 protein Pestalotiopsis SLA2 putative acetylcholoinesterase neglecta pn3470 EMR61857.1 8300% 98% 4.00E-10 precursor protein APN2 pn2907 Hypothetical protein (probable APN2) XP_001903014.1 58% 99% 0 330

Query Query Species location in Top Hit Accession %ID† %Query‡ E-value Identity genome Hypothetical protein (unknown pn2367 ELA32918.1 56% 53% 4.00E-08 function) pn2059 No matches Hypothetical protein (probable nitrate pn2175 XP_003653361.1 77% 81% 3.00E-59 reductase) pn1637 Hypothetical protein XP_001593912.1 61% 75% 5.00E-14 pn713 SIT4 phosphate-associated protein EGX95517.1 100% 51% 0.004 MAT2 Hypothetical protein (probable pn1653 XP_002838525.1 44% 46% 1.3 pyruvate decarboxylase) pn2098 No matches APC5 pn1653 anaphase poromoting complex protein EFQ29496.1 69% 95% 0

COX pn2907 Putative cytochrome c oxidase EMR72081.1 91% 64% 3.00E-21 Transmembrane actin binding-like scaffold110 EGR50054.1 93% 96% 0 SLA2 protein scaffold44 Putative arsenite resistance protein EMR69908.1 98% 98% 3.00E-10 scaffold110 Hypothetical protein (putative APN2) XP_001903014.1 66% 96% 0 Hypothetical protein (unknown scaffold52 XP_001903388.1 54% 99% 9.00E-59 function) scaffold33 Transmembrane protein ELA33651.1 67% 83% 6.7 Pestalotiopsis APN2 putative ochratoxin a non-ribosomal sp. scaffold716 EMR70362.1 72% 98% 5.00E-20 peptide synthetase protein Hypothetical protein (probable scaffold628 EFQ32338.1 95% 98% 4.00E-16 mitochrondrial carrier protein) scaffold318 HMG box transcriptional ELA37846.1 86% 98% 4.00E-32 scaffold947 hypothetical protein XP_001905098.1 58% 95% 2.00E-21 MAT2 hypothetical protein (HMG-box scaffold110 XP_003051581.1 37% 97% 2.00E-10 containing protein) 331

Query Query Species location in Top Hit Accession %ID† %Query‡ E-value Identity genome putative anaphase promoting complex scaffold110 EMR72084.1 76% 96% 0 protein APC5 scaffold99 HET domain containing protein EIW82553.1 56% 84% 5.00E-26 scaffold443 No matches found scaffold110 Putative cytochrome c oxidase subunit EMR72081.1 90% 99% 2.00E-38 COX scaffold259 Putative c6 transcription factor protein EMR66535.1 82% 98% 2.00E-15 putative cytoskeleton assembly control SLA2 17551 EMR72087.1 86% 97% 0 protein sla2 17551 hypothetical protein (probable APN2) XP_003660213.1 82% 94% 5.00E-177 hypothetical protein (unknown 17031 XP_003046908.1 69% 92% 4.00E-06 APN2 function) hypothetical protein (unknown 16665 YP_001657196.1 42% 60% 2.1 function) M. majus 99049 17541 HMG box transcriptional ELA37846.1 81% 98% 8.00E-30 hypothetical protein (HMG box 17527 EGR50539.1 40% 94% 5.00E-13 protein) MAT2 predicted protein (putative Ku-domain 17484 XP_004345286.1 38% 53% 1.4 containing protein) 17551 HMG box protein EFQ28928.1 90% 98% 8.00E-34 putative anaphase-promoting complex 17551 EMR72084.1 73% 99% 0 protein 17577 no matches found APC5 hypothetical protein (putative kinesin 232 EMD64013.1 78% 97% 0.042 light chain)

hypothetical protein (putative 16833 EGY17196.1 64% 98% 2.00E-14 superkiller protein) COX 17551 predicted protein (probable cytochrome EEH23565.1 53% 63% 2.00E-25

Query Query Species location in Top Hit Accession %ID† %Query‡ E-value Identity genome c protein) 17568 no matches 16727 immunoglobin lambda light chain ABA70877.1 34% 89% 1.8 17604 cytochrome c oxidase subunit EKG11568.1 43% 81% 1.2 17575 no matches found putative cytoskeletal assembly control SLA2 scaffold11 EMR72087.1 88% 97% 0 protein sla2 protein hypothetical protein (probable DNA scaffold11 XP_003660213.1 79% 97% 1.00E-176 lyase) hypothetical protein (unknown APN2 scaffold469 XP_003046908.1 69% 92% 4.00E-06 M. majus 10095 function) hypothetical protein (unknown scaffold45 YP_001657196.1 42% 60% 2 function) scaffold188 HMG box transcriptional ELA37846.1 80% 81% 3.00E-23 putative HMG box (probable MAT1-2- scaffold11 EMR65965.1 44% 75% 2.00E-06 MAT2 1) predicted protein (probable HMG box scaffold25 EGR50539.1 40% 94% 5.00E-13 protein) putative anaphase-promoting complex scaffold11 EMR72084.1 73% 99% 0 protein APC5 hypothetical protein (probable scaffold112 EGY17196.1 64% 98% 2.00E-14 superkiller protein) predicted protein (probable cotycrome scaffold11 EEH23565.1 53% 63% 2.00E-25 c oxidase) COX scaffold259 no matches immunoglobin lambda light cain scaffold40 ABA70877.1 33% 93% 0.94 variable region

Query Query Species location in Top Hit Accession %ID† %Query‡ E-value Identity genome scaffold302 cytochrome c oxidase EKG11568.1 43% 81% 1.2 scaffold57 no matches putative cytoskeleton assembly control 21826 EMR72087.1 86% 98% 0 protein sla2 SLA2 hypothetical protein (unknown 22728 XP_003715561.1 69% 98% 0.003 function) 22786 uncharacterized protein XP_003562413.1 65% 63% 4.2 hypothetical protein (probable DNA 21826 XP_003660213.1 65% 99% 0 lyase) APN2 22484 ap endonuclease XP_002842988.1 46% 93% 8.9 22821 extracellular serine-rich protein ELA33596.1 51% 45% 7.00E-05 M. nivale 12262 21989 HMG box transcriptional ELA37846.1 81% 98% 8.00E-30 MAT2 predicted protein (probable HMG box 22612 EGR50539.1 43% 94% 2.00E-14 protein) putative anaphase-promoting complex 21826 EMR72084.1 74% 99% 0 protein APC5 hypothetical protein (possibly related 22417 ENH79118.1 65% 89% 0.005 to APC5) hypothetical protein (probable 21826 XP_003347107.1 79% 55% 2.00E-26 cytochrome c oxidase) COX hypothetical protein (probable 22484 EME78310.1 65% 41% 0.68 monooxygenase) 22715 no matches 22664 no matches hypothetical protein (probable 22792 XP_001551465.1 100% 98% 2.00E-17 phosphatidylinositol 4-kinase) M. nivale 11037 SLA2 11541 putative cytoskeletal assembly control EMR72087.1 88% 98% 0

Query Query Species location in Top Hit Accession %ID† %Query‡ E-value Identity genome protein 11541 putative DNA lyase apn2 protein CCD57793.1 53% 98% 0 APN2 10889 hypothetical protein EIW83489.1 34% 8% 3 11277 no matcheshits 4647 HMG box transcriptional ELA37846.1 80% 98% 3.00E-27 MAT2 11541 putative mating type protein MAT-2 CAD59615.3 32% 99% 6.00E-08 11541 putative anaphase-promoting complex APC5 11182 no matches hypothetical protein (probable 11541 XP_003347107.1 55% 99% 1.00E-28 cytochrome c oxidase) 10911 no matches COX 11037 no matches 11448 no matches 11597 no matches putative cytoskeletal assembly control scaffold207 EMR72087.1 86% 98% 0 protein sla2 protein SLA2 hypothetical protein (unknown scaffold48 XP_003715561.1 69% 98% 0.003 function) M. nivale 10106 scaffold9 predicted protein (unknown function) XP_003562413.1 65% 63% 4.2 hypothetical protein (probable DNA scaffold207 XP_003660213.1 65% 99% 0 lyase) APN2 scaffold639 no matches scaffold320 extracellular serine-rich protein ELA33596.1 51% 45% 7.00E-05 scaffold602 HMG box transcriptional ELA37846.1 79% 76% 6.00E-21 MAT2 putative HMG protein (probable scaffold207 EMR65965.1 41% 77% 3.00E-06 MAT1-2-1)

Query Query Species location in Top Hit Accession %ID† %Query‡ E-value Identity genome predicted protein (probable HMG box scaffold203 EGR50539.1 43% 94% 2.00E-14 protein) putative anaphase-promoting complex scaffold207 EMR72084.1 74% 99% 0 protein APC5 hypothetical protein (unknown scaffold165 ENH79118.1 65% 89% 0.005 function) hypothetical protein (probable scaffold207 XP_964326.1 73% 27% 2.00E-25 cytochrome c oxidase) scaffold543 dimethylaniline monooxygenase XP_001267756.1 64% 89% 5.00E-42 scaffold641 no matches COX hypothetical protein (unknown scffold91 XP_001551465.1 97% 98% 2.00E-17 function) hypothetical protein (probable scaffold327 XP_001551465.1 97% 98% 2.00E-17 phosphatidylinositol 4-kinase) putative cytoskeletal assembly control SLA2 scaffold41 EMR72087.1 85% 97% 0 protein sla2 protein hypothetical protein (probable DNA scaffold41 XP_003651351.1 84% 96% 0 lyase) APN2 scaffold14 no matches scaffold28 predicted low-quality protein XP_003509871.1 54% 59% 2.6 M. bolleyi 07020 scaffold7 putative hmg box protein EMR65965.1 88% 98% 7.00E-32 MAT2 scaffold41 MAT1-1-3 EKJ71585.1 38% 80% 8.00E-09 scaffold14 HMG box protein EFQ28928.1 93% 82% 2.00E-41 putative anaphase-promoting complex APC5 scaffold41 EMR72084.1 74% 99% 0 protein hypothetical protein (probable COX scaffold41 XP_002792088.1 82% 25% 5.00E-25 cytochrome c oxidase)

Query Query Species location in Top Hit Accession %ID† %Query‡ E-value Identity genome nonribosomal peptide synthase, scaffold43 XP_003016838.1 61% 85% 2.00E-06 putative scaffold14 no matches scaffold7 no matches

* When the top hit was a hypothetical or putative protein, this hit was in turn searched against the BLAST database to determine a tentative identity whenever possible.
† Percent identity (i.e. percent of nucleotides that were identical between the two sequences).
‡ Percent of query sequence that was represented in the match.
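The two percentages defined in these footnotes can be illustrated with a small worked example. The sketch below assumes a pairwise alignment given as two gapped strings of equal length and takes identity over the alignment length, which is how BLAST reports it; the function name and the toy sequences are illustrative and do not come from the thesis.

```python
def identity_and_coverage(aligned_query, aligned_subject, query_length):
    """Return (percent identity, percent of query covered) for one alignment.

    aligned_query / aligned_subject: gapped strings of equal length ('-' = gap).
    query_length: length of the full, ungapped query sequence.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    columns = len(aligned_query)
    # Count columns where both sequences carry the same residue
    matches = sum(
        1 for q, s in zip(aligned_query, aligned_subject)
        if q == s and q != "-"
    )
    # Count query residues that take part in the alignment
    query_residues = sum(1 for q in aligned_query if q != "-")
    percent_identity = 100.0 * matches / columns
    percent_query = 100.0 * query_residues / query_length
    return percent_identity, percent_query

# Toy example: 9 alignment columns, 7 identical, 8 of 10 query residues aligned
pid, cov = identity_and_coverage("ACGT-ACGT", "ACGTTACCT", query_length=10)
print(f"{pid:.1f}% identity, {cov:.1f}% of query")
```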

[Figure 5.1 schematic: gene order from 5' to 3'. For heterothallic species the alternative idiomorphs are shown in parentheses, separated by "|".]
Neurospora crassa: APN2, (MAT1-1-3 MAT1-1-2 MAT1-1-1 | MAT1-2-1), SLA2
Gibberella zeae: APN2, MAT1-1-3 MAT1-1-2 MAT1-1-1 MAT1-2-1, other, SLA2
Botrytis cinerea: APN2, (MAT1-1-5 MAT1-1-1 | MAT1-2-1 MAT1-2-4), SLA2
Sclerotinia sclerotiorum: APN2, MAT1-1-5 MAT1-1-1 MAT1-2-1 MAT1-2-4, SLA2

Figure 5.1 Orientation of the MAT1 region and flanking genes in the Sordariomycete species Neurospora crassa (Butler et al. 2004), Gibberella zeae (Yun et al. 2000), Botrytis cinerea, and Sclerotinia sclerotiorum (Amselem et al. 2001). Diagram is not to scale. A vertical bar extending over the MAT1 locus indicates that this species is heterothallic and that the gene(s) located on the parallel bars are interchangeable in the two mating types.

SPECIES

M. majus APC5 SLA2 MAT1-2-1 APN2 COX13

M. nivale APC5 SLA2 MAT1-2-1 APN2 COX13

M. bolleyi APC5 SLA2 MAT1-2-1 APN2 COX13

Pestalotiopsis SLA2 APC5 MAT1-2-1 COX13 APN2 sp.

Daldinia eschscholtzii COX13 APN2 SLA2 APC5 MAT1-2-1

Annulohypoxylon APC5 SLA2 // APN2 COX13 MAT1-2-1 stygium

Hypoxylon sp. APC5 SLA2 // APN2 COX13 MAT1-2-1 (EC)

Hypoxylon sp. APC5 SLA2 APN2 COX13 MAT1-2-1 (CI) //

Hypoxylon sp. APC5 SLA2 APN2 COX13 MAT1-2-1 (CO)

Figure 5.2 Orientation and synteny of the putative MAT1 region and the flanking genes APC5, SLA2, APN2, and COX13 in several species of Xylariales, including Microdochium sp. Diagrams are not to scale. A double slash (//) indicates a long (>10 kb) distance between putative genes found on the same scaffold, and a vertical bar (|) indicates that the gene(s) that follow was / were found on a different scaffold.
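A comparison of this kind can be sketched as a simple neighbour check on ordered gene lists. The gene orders below are transcribed from the figure listing as it reads in this document for the three Microdochium species and for Pestalotiopsis sp.; scaffold breaks (|) and long gaps (//) are not modelled, and the helper names are illustrative rather than part of the original analysis.

```python
# Gene orders as read from the Figure 5.2 listing (scaffold breaks not modelled)
gene_order = {
    "M. majus": ["APC5", "SLA2", "MAT1-2-1", "APN2", "COX13"],
    "M. nivale": ["APC5", "SLA2", "MAT1-2-1", "APN2", "COX13"],
    "M. bolleyi": ["APC5", "SLA2", "MAT1-2-1", "APN2", "COX13"],
    "Pestalotiopsis sp.": ["SLA2", "APC5", "MAT1-2-1", "COX13", "APN2"],
}

def neighbours(order, gene):
    """Return the genes immediately left and right of `gene` (None at an end)."""
    i = order.index(gene)
    left = order[i - 1] if i > 0 else None
    right = order[i + 1] if i + 1 < len(order) else None
    return left, right

def has_sla2_apn2_flanks(order):
    """True if MAT1-2-1 sits directly between SLA2 and APN2, in either orientation."""
    return set(neighbours(order, "MAT1-2-1")) == {"SLA2", "APN2"}

for species, order in gene_order.items():
    print(species, neighbours(order, "MAT1-2-1"), has_sla2_apn2_flanks(order))
```

Comparing sets rather than ordered pairs keeps the check independent of the strand on which a scaffold happens to be reported.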


Figure 5.3 Single segment of wheat straw inoculated with M. majus isolate 99049 and M. nivale isolate 99077 incubated on water agar at 20 °C for approximately two months. Note the production of perithecia on the side closest to M. majus (A) but not M. nivale (B). 10x magnification.


Figure 5.4 Perithecium of M. majus isolate 99049 at (A) 40x and (B) 100x magnification. Perithecia depicted were observed after two months of incubation on wheat straw on water agar at 20 °C.

Figure 5.5 Ascospore (centre) produced by M. majus isolate 99049 at 400x magnification. The spore depicted was observed after two months of incubation on wheat straw on water agar at 20 °C.

[Figure 5.6 dendrogram. Tip labels: 99061; 99049 AS1 to AS10; 99049 parent; 10099; 12043; 12044; 10098; 12045; 12046. Scale bar: 0.1.]

Figure 5.6 Bootstrapped UPGMA tree depicting the relationships between ten single-ascospore cultures derived from M. majus isolate 99049 relative to their parent culture and DNA from seven other M. majus isolates, including one isolate collected from the same location on the same date as the parent culture (99061), two isolates from Europe (10098 and 10099), and four cultures collected from the same wheat field on the same date (12043-12046). The horizontal bar represents 10% sequence divergence. Bootstrap values are out of 100.

Appendices for Chapter 5

Appendix 5.1 Alignment of MAT1-1-1 sequences collected from GenBank for primer design

AM983455.1Achr ------FJ858801.1On-u GCTGACTAGATCACTCTTCAGGCCGTATTTTGAAGACACAAAAACGAGCAGTTACACGAC 60 AY527415.1Foxy ------

AM983455.1Achr ------FJ858801.1On-u AAGACCAAAGAAACGGACAAGTGTGTTGAATTATTAGATATAAATAGAAAGCTGAAAGAA 120 AY527415.1Foxy ------

AM983455.1Achr ------FJ858801.1On-u AACTGAAAGAAAGCTGAATGAAAACTGACAGAGAACTGACAGAAAAGCAACGGTAATACC 180 AY527415.1Foxy ------TT 2

AM983455.1Achr ------FJ858801.1On-u GCCATAGAAACCCCCACTTTCAGCTGGTCAGCCGCTAGTCCATATCCGGCCATTTCTGAC 240 AY527415.1Foxy GTGGTTGAACTCTTCTTTCTCAGC------CATATCATACCAGT---AAG 43

AM983455.1Achr ------FJ858801.1On-u TTATACGGCCAGACCCCTCTCTACCTCTCGACCTCTCCATTACAGGTCGCATCCAGCGCA 300 AY527415.1Foxy CTTGAATCTCAGGTGTCTCTTCATCCCACATCCTGCCGATCAC---TCTCGCTCGGTATA 100

AM983455.1Achr ------FJ858801.1On-u TAAACACAGAATAGCCACCAATATGGCCACCAACAATGTCTGTTTTCTCTTTCTCTGGCA 360 AY527415.1Foxy ------AGTTAGTATTGTCCAGCTCTCCAGGA 126

AM983455.1Achr ------FJ858801.1On-u TGAGAGCTGTCAATTTGACTATTCTTTAGTCCAGACATGACAGGTCCAACAACAAATCAT 420 AY527415.1Foxy CGA------TGTACCTACAAAGTTC 145

AM983455.1Achr ------FJ858801.1On-u ACAAAGTGAGCTGTGCAAGTTCAGCCCGTCTCGGACGAGTCGAGCAAGAAAAGGATTGGA 480 AY527415.1Foxy AGAGGCCGTGATTCC----TTCAACCTTTTTTGT------GA 177

AM983455.1Achr ------FJ858801.1On-u TATCTTGTTCGTCATGCTGACTAATGGGGCTGGCCATTTATTATCTGTTATTTGTACGCA 540 AY527415.1Foxy TCTCCTGTG------ACTTGGACTGACGATATAGAAGCCAGCAGTTCCTAGGG 224

AM983455.1Achr ------FJ858801.1On-u CTTTGAGTTTAATTTTGTATTTCAATCTATGCCATATACATTTCCATCTCTAAAATATGT 600 AY527415.1Foxy CGTGGGATTCGACACCCTCCCTCGCC------250

AM983455.1Achr ------FJ858801.1On-u TCAGCCTTGTGGTGATGAGACATTGGTCTCACTGTCTAGTATGTACACTGGAGAATGATG 660 AY527415.1Foxy ------GACGGTTGGATCTTCGTCGGAAACCTTCACTGGCGTGGATTG 292

AM983455.1Achr ------FJ858801.1On-u AGGCTCTGTGGACGAACTGGAAAAAGTAGACAGCGAAGTGGAAAAAGTGCAGCATTAAGA 720 AY527415.1Foxy G---TCTATAAATGTGTCAGAGCG-GTACCATTCGAAATGATTCCTGTGCACCAT----- 343

AM983455.1Achr ------FJ858801.1On-u CGTCGGACAACAAATGCAACGTAAGCAACAAAAACATGACGAGAATGCATTCGTACATCT 780 AY527415.1Foxy ------ACCACGAGTGAAGCAAAATCTCCCATAAGTCGAGGTATCAGGGCTCACATCTTC 397

AM983455.1Achr ------FJ858801.1On-u GACAACTTGGAAATGACGGTCTGTCGAATATGACTCTCATGTGACAGCTGCTGGCCCCTA 840 AY527415.1Foxy AGGAATTGGACAAAGGCGGTACTT------421

AM983455.1Achr ------FJ858801.1On-u TGCTTTCCCCTTTGTTGTCCATCTTTGTGCATGTCTTGTAAGGAGGCAAAGTATCGAGAC 900 AY527415.1Foxy -GCTTCGTGCTTTGTCATGAAAAACTCTCACGGGTTGTTGGACGCGACGACTATAATAGT 480

AM983455.1Achr ------FJ858801.1On-u AAGAGGCTCTTTCTCTCTTCCTTTCCATCTCCCGTTCTTATTTCTTGCCTGAACAAGTTG 960 AY527415.1Foxy TAGACGGTAGTGTT------GCCCAAAGATTTGGTGTCCTACCTGAATTTGTCT 528

AM983455.1Achr ------FJ858801.1On-u ATGAAAGCAGTGTAGACAGCCAGGCACGAGTCTTCTTACAGAAACAT-CATACATTAGAC 1019 AY527415.1Foxy GCAACATGGTTGATGAGCATCAGAGAAGCGTTCTCTGGGAGAAAGATGTGAATATCGGCT 588

AM983455.1Achr ------FJ858801.1On-u AAAGACGGCTCCTCGATTTTAGCAACTGCTGCAACTGCTGCATGACTCTCGAGCATTTAT 1079 AY527415.1Foxy TGCGACTGGCTGTAGACCATAGTGATGGGAACTTCGGCTTCCGGGGGCATGAGAAC---- 644

AM983455.1Achr ------FJ858801.1On-u TTGTATTGGAGATCAGAATCAACATGCACTCTGTCACTGGTTTCTGCTCGTTTCTTGATT 1139 AY527415.1Foxy ------GCCCAGAG------GACC 656

AM983455.1Achr ------FJ858801.1On-u TTAAATTAAGCCCCCACGGCTTTTGAATTGCAGTACTATCAGAAACTATAGAATATATCC 1199 AY527415.1Foxy TTCACGTAAGCGTCCTTTCGAAGTTCCAGGCATTCTTACGACAAGCTATCAATAGTCAAC 716

AM983455.1Achr ------FJ858801.1On-u GTATGAAAAGCACTATTGGCTGTCAACTAGAACTGAGACATGATATGACGAGAAGGGAAA 1259 AY527415.1Foxy GATGAGAGGGATTTTCTTGTAGGCAACTGAAATTTGAGTATTCTCTGTGGTAAAGAAATG 776

AM983455.1Achr ------FJ858801.1On-u TCGAGCACTAACGACAGAACTGAGACAGAACTAGTAAACTAAGTCGAGGGGTAC-CAAAG 1318 AY527415.1Foxy GCAAAAAGGAAGTTGATGCGCCAGCAATGATGGGGAAACTGATGCAGGTGACAAACTGAG 836

AM983455.1Achr ------FJ858801.1On-u AGAAAATAAAGGAAATTTGAAGAAATGCAATGTATGGAAATCAACTGTGCCGTGCAACGC 1378 AY527415.1Foxy GCAAGTCAAAAGTAATATGAACAA------TAGGACAGTCTGTTGAGATTGTCAATAA 888

AM983455.1Achr ------FJ858801.1On-u CGTGTGGCTACTATGCTAGCATAGTAAGCGTCTATGCAGTGATGAGAAAAATGCCAGACC 1438 AY527415.1Foxy CAAACAGTTAAGGTGTACTCTGAGCTCTATTATTCACTCGTTTGCTTATTACGTCATGTA 948

AM983455.1Achr ------FJ858801.1On-u GTGTTGCATGTGGATGCGAATGTGTAGAGTAAAAGCCAAGGAAGGTAGTAGAAAGCGTTA 1498 AY527415.1Foxy ATGTAACAAATCTGTG-AAACCTGTAGCTGCCACTGCCAGACGGCTGGCGGTGATGGTAT 1007

AM983455.1Achr ------FJ858801.1On-u TTGGGAGACAGGAAGAAGGATTATGCCCCTGACGATTCTTCTGGGCTTGCAAAAGAGAAG 1558 AY527415.1Foxy CTTTCGCACACTTCAG----TTCTATGTTTGTTTATTTTTCTTCTTTCCTACACAACACC 1063

AM983455.1Achr ------FJ858801.1On-u CAGAAACCAGGTTGTTAAT------GCAAGCGAGCGATTGCATGACATCGCTTTTGTCTG 1612 AY527415.1Foxy CACACTTCTCATCATCGTGTCCTCAGCCATGGACAGCTCGTTCAGCTTCAGCCCTTTATG 1123

AM983455.1Achr ------FJ858801.1On-u GTGATGGCTAGAAAGCCGTAACTTCAGGATCTAGCTGCTCCCAGTCTTCCTTTCTTGCCA 1672 AY527415.1Foxy GGAAGATCCAGCGAT---TATCTACAAGCCCGAAAAGGCTCTCGACGCCCTCCATGCCAA 1180

AM983455.1Achr ------FJ858801.1On-u GTTGCGTTGTATTTGTACCCGGGGTACATCTCTCTGTGAAGCTTCTGTGCCTCCATTGCT 1732 AY527415.1Foxy GATCCTCAGCATCATCCTGAGAAAAATCGACTTGCCCAAAGAGGGGGAGAAATTCTATCC 1240

AM983455.1Achr ------FJ858801.1On-u TTGGAGTTGTAGCAAAGGCGTGTCTCGCGGCTCTCAAGCTTCCACATCTTGGAAACAGCA 1792 AY527415.1Foxy AAAGGGTAATGGTATCGTTATTGTTTCAAGCATGACAGCAACCTAACTTTCTGAATAGAT 1300

AM983455.1Achr ------FJ858801.1On-u GTTGCT-GTTACATGTGGTTAGTCATTTCTTCAAAGAAAAATAAAGATGGTGCAGCTGTA 1851 AY527415.1Foxy GTTCTTCGCTCTGTGTTCTTCGTCATC------AACCAAGTGATGACCGACCTATC 1350

AM983455.1Achr ------FJ858801.1On-u ACTCACAAATATATCCAGCTGTAGCATGGGGG--ATTTGCTTGCGGATCTCCTTTGATTT 1909 AY527415.1Foxy CATCGATAATGAGCTCCTCAACGGAATCCGAGCTACTCATATCAGATTGGCCAGGTATGG 1410

AM983455.1Achr ------FJ858801.1On-u CTGAGACCTGTAAATGATCCAGGCATTAGGCGGGCGAGGGATGCGTGACTCGGTTGTGGT 1969 AY527415.1Foxy CTCGCCGCTCAACATCATTGATGCCGCCCTTGTGAGATGGTATACTGGAGCTGTCGTTGT 1470

AM983455.1Achr ------FJ858801.1On-u AACCGGTTCAGGGTTCTTGGCTTCTTTGTTGGCAGCTGTGTATTTATTAGTCGGAGTCTA 2029 AY527415.1Foxy CCTCTACTCCCATCAGTCTGTTCGCTCTCCAACTGGCATGC-CTCGTCATTGGATCCCAG 1529

AM983455.1Achr ------FJ858801.1On-u TATATACAAATGGTCTGTTACCTGCACTTACCTTTGGCATGGCCGATGATCTCAAAGCGG 2089 AY527415.1Foxy GCTACCGGAGCAATCACCCAATCGC---CAACTTAGGCTTTATGACCATTCTTAGAGGTT 1586

AM983455.1Achr ------FJ858801.1On-u TCAGCACACGGAGTGCTTGCATTTACCATCTGAGTCAAGCGGAATGTCCGAGACTGATGG 2149 AY527415.1Foxy TCGAGGAGTGGGCTCACCCTAAACACCCCAAACTTCAGGCTGCATCCCTGATTTCGAAGG 1646

AM983455.1Achr ------FJ858801.1On-u TCGTAGACA------ATGGAGCAATCCTCATTGGACATCGACCTGAATTATTAGAAT 2200 AY527415.1Foxy TCGCCCTCGCGATTCTGTATGCTACTTACACTATCGGGCCTCATCTCGAAGGATTCACGT 1706

AM983455.1Achr ------FJ858801.1On-u GAGTGAGTATGGCGAGGAAGATAGTTTTCACAGCTTACTGGAAGTTTCCTGCGAGGTTGC 2260 AY527415.1Foxy ---TCAACCACCTTAGCCACATGCCCATCTCTAAGTCGCGAGAGCTCTTTCTGAGGGTCT 1763

AM983455.1Achr ------FJ858801.1On-u GAAGCTGCTCGGACCCGATCATCTCCTCTGGGATGACGACAATGAATTTCTTCTCATCAT 2320 AY527415.1Foxy ------TTGTCTCCATTTCCGGCAACGTATACAATGACGACGAGGTCTTCACGACGCC 1815

AM983455.1Achr ------FJ858801.1On-u CATTGACAACAAGTGTCACTCGGCGCTCGATAGTGGTAGGCATTATGTGGTGATGATGTT 2380 AY527415.1Foxy TCCGGCCTTCGAATTCGGCGCGGCTCA------GGGCAACGTCCGGATCTGTCAGC 1865

AM983455.1Achr ------FJ858801.1On-u AATCGAAGATGTGGTGTTCTTGGAATTTGGAATTGATGGGATCTGATTGATGAAGTAGGG 2440 AY527415.1Foxy AAGGGAAGAAGC--TCCTCGTCAGATCCCTAAGTGAACAATTCT-----ACCGCGAAGCC 1918

AM983455.1Achr ------FJ858801.1On-u CGATAATGTTAAGTATGTCAAAGGAAGAGGGAACTTGGCGGGGTAGGAGATGGCAGAAGT 2500 AY527415.1Foxy CCTGACTGGCATCCGTATCGACGAGTACCCGGGTCCCC------ATGGAACAACT 1967

AM983455.1Achr ------FJ858801.1On-u TCATTAATATACCAAGGTGAATTGGAATCCAGAAGAACTTGCTTGTCGCACGACAACTGA 2560 AY527415.1Foxy TCATCAGAAACACGGAGCTGCCTGTCTTTTCAACCAACGAAAATCCTACGCCTCACGCGA 2027

AM983455.1Achr ------FJ858801.1On-u TCACAATGTAAGTGGCTGGAAATGATGCAAGTGATTTCAAAATTAAATTGAGAATGGAGG 2620 AY527415.1Foxy AGACCAAGTACATTCTGCCTTACAACGCGATGGTTCTCGCCGGGCAGTTCAAGGCAAAGT 2087

AM983455.1Achr ------FJ858801.1On-u ATCGACGAACCCGC--TTTAGAACAATAGTAAGAAAAACTGGGTCCATCTTTCCTTGTAA 2678 AY527415.1Foxy ACAATCTGGTCCGCGCTTTTCTTCGGCACCCACCAAACCAACTTCCTGAGCTCTCCGAGG 2147

AM983455.1Achr ------FJ858801.1On-u TTGTGTATCCTCAATGGGCTTCGCCATTGTATGAGTTGTTACAGTACAAATTTCATTATA 2738 AY527415.1Foxy AGGTAAAAACCTTGCAGCTCAAGTGGTTTGCTCAACATTCCGGGCGAGGAT---ACGCCA 2204

AM983455.1Achr ------FJ858801.1On-u GCATACTGTATGAGATGCAGGAGAAGGTGTTGAGGGTATTATATATACGGGTTTGTCTTC 2798 AY527415.1Foxy ACATCGCCCCTGATATGCA------CACAGCTGAGTCTCTCGTT 2242

AM983455.1Achr ------FJ858801.1On-u TGGCTTTAGATTTGTAGAGTTTACCATCCTTTCAAACGTCAAAGCCTTCATCATGGAGCG 2858 AY527415.1Foxy AGCGTTGGGAACGACTTCCTCTACCATACTCCTGCAATTCAAAAGCCCCGTG------2294

AM983455.1Achr ------FJ858801.1On-u CACAGAATCCGGGGCCGACAGGCTCGTTGAAACACCTTTTCATATTTTGCAAAATGCGAG 2918 AY527415.1Foxy ------CAGACCCGAGTGGGTTTCTGACTCTGCCTTTCATGTC-----AACTGCGGT 2340

AM983455.1Achr ------FJ858801.1On-u TCATCTTACCGAAACATGCCAGGGGATGTCCGGCTTGAGCTGTGGTGTGAAGGCTTTGGT 2978 AY527415.1Foxy TTATGCAATCGAGCGTCACCGGGA----TCCCGCAGAGACTCCTGTTCAAGGCCTTTTGG 2396

AM983455.1Achr ------FJ858801.1On-u TCAGTCGCTCTTTGACGACCTGGTTTCCCATGAAGAGAAACAACAAGGGTTTCTTCTTGC 3038 AY527415.1Foxy T--ATTGACCTTCAATCAAGAGCTGCCGGATGAAGTGGCCAATTAAG------2441

AM983455.1Achr ------FJ858801.1On-u TCGGCTCTATATCAAGACT-CTCGTGGAGGGGATGAGAACAAAGGACAGTAAGTCGGCAG 3097 AY527415.1Foxy ------CTATAGCAAGGTTTCTTGTAGAGTGCAAAAGGAGATATA------2480

AM983455.1Achr ------FJ858801.1On-u CCGCCGGTGTGTTAGTCCTTATGGATTCATACCCCGCGTACATCGAACGACTGGTTGCCT 3157 AY527415.1Foxy ----CGTAGTATAAGTTTCTAG------2498

AM983455.1Achr ------FJ858801.1On-u GCATTGCTATTCTTGACGATCTGATTCTCGGAGCCAATAGCCTTCTTGTGCGGTTCCAGA 3217 AY527415.1Foxy ------CTAGGTAGTCAAATCAATGCGGTCAAAACC------AGTCACAAA 2537


AM983455.1Achr ------FJ858801.1On-u TTCTCCAAGCCTGTATCGACGTCCCATCAACCTCAACTCTCCGCAAGTTTGTCACCGAGC 3277 AY527415.1Foxy TTCTTCGAATG------GTTCAATCGGTTTCGCCTTGGCAC------TGTCGAGC 2580

AM983455.1Achr ------FJ858801.1On-u AGAAAGCGGTATACTGCGCTTTCCGAGACAAGTACAAGAAGGAGGCAAGTCATAATTCTT 3337 AY527415.1Foxy A------TCTGCAAAACTGCCCGTAGTACGC-TAAGAAAACGGCGGCTGTCAAACACT 2631

AM983455.1Achr ------FJ858801.1On-u GCCGTTGGAGAGATGCCCCTGGTGCTGAGTTTGGCATGAAGCCCTTTGAGGAGATTGTCT 3397 AY527415.1Foxy GCCTACCACCTCATTCTATTGTTCAGGTACTTCCTACTAGGTACTTAG------2679

AM983455.1Achr ------FJ858801.1On-u ATGCCTGGGATGGAACCATCATGGTGTGGGACAATGAGTTTGACGACTGGGGATGGCTGG 3457 AY527415.1Foxy ------GTTCCTGCCAAGGAATTTATTCAT------2703

AM983455.1Achr ------FJ858801.1On-u AACACTACCACCCATGCAGATATATGCCTGGAAGTGAATGGGGGAAATTCTATCACGACT 3517 AY527415.1Foxy -ACACCCCGTCTTCTCACCCTCCAGACCTCTTACTTATTAACACCACTTCT------CT 2755

AM983455.1Achr ------FJ858801.1On-u TGACGCATGGAGATTTTTGGGTTCTTGATCAACGGCCAATCGTCGACTGGACTCATGCAG 3577 AY527415.1Foxy CCACGCTCAGAAATTCTT---CTCTCCAACCTCTGACAACTCACGCTTTAGAACTGGCAC 2812

AM983455.1Achr ------FJ858801.1On-u CCCCAGCAAACGCCATCCTCAGCCACCTAGAAACACAGTACAGGCGTGCTGACCAAACAA 3637 AY527415.1Foxy TCTTTTTATTCAGTGTTTCCCACCATCTACTTAAATCAAGCTCCTACATTCTCTCACTCG 2872

AM983455.1Achr ------FJ858801.1On-u ACAATGTCGAAGTAAGTACTATGATCCGCCGCTATGACTAGAAACACCAATAACAAAAAT 3697 AY527415.1Foxy TCGCTGTCAACATGGAAGTTCTGATC------AACGC 2903

AM983455.1Achr ------FJ858801.1On-u TGTAGGGGCTGGCTGAAAGCACGGTGAGCAGAGATACAGCAGTGAGTCCTTATCCTATCC 3757 AY527415.1Foxy TGCCAAGGATGAGCAAAAGGTCG------2926

AM983455.1Achr ------ATGACTACACGAG------CAGCC 18 FJ858801.1On-u TATTCTCGGTGATTATAATCCAACTGACAACTGCACCAGAAAAACATCCGCCTTGCAGCT 3817 AY527415.1Foxy ------CGGCC 2931 * **

AM983455.1Achr CTCGTGGAAC--GTCTTTCTGGCGTCCCAGCGACAGAGTTGT-----TGGACTTCTTGA- 70 FJ858801.1On-u TCCACGGGGCTAGCCTATCCAGT-TTGTGATGATGAGCTTCTCCAGGTGAGCTTCTTGAA 3876 AY527415.1Foxy TTGAACGA-----CGTGTCCAAGATCGCAACCATGCGTGCAGTCTCGCAAACTTCT---- 2982 * * ** * * *****

AM983455.1Achr ------CAGACGACGCCATTATC------87 FJ858801.1On-u TAATGTGAATAAAAATAGCTGAAGCTATTTTCTAACAACTTCAAGTCAACCACTTTTGAC 3936 AY527415.1Foxy ------

AM983455.1Achr ------FJ858801.1On-u ATTGATCTTAGCGATGTCACAACTAGCCAATTTGTGAGTTAAAAAATCCGGTTCTTTAAT 3996 AY527415.1Foxy ------

AM983455.1Achr ------GACCTAGC 95 FJ858801.1On-u GCATATCAGCTCACATTTCACTAACATATCAGATTCCACCACCCGAAGACATGGCCAGGC 4056 AY527415.1Foxy ------CCAAAC 2988 ** *

AM983455.1Achr TGCCCGATACTTCCAACGAGTCTCGGAGG-----CACAGGCAAGG------135 FJ858801.1On-u TGGATGAGCTAAACAGCAAGGTCCAGGAGGTCATCACAGGAAAGGTTCCCGAAGTAGCAA 4116 AY527415.1Foxy CGGTTGGCTGAGGCTCTATCGTCTCCTCACATGTTGAAGGAAATGC------3034 * * * *** ** *

AM983455.1Achr ------FJ858801.1On-u AAGTCCAGGAACATACTGAAAAGATCTACCGGCTTGGGCAATAGCCAAATTTTGGGGGGT 4176 AY527415.1Foxy ------

AM983455.1Achr ------FJ858801.1On-u AGATAAGATGCGAGGGGTATCGTGGCTTCTGATAGATAGAGCGAACACATACGTGAGCAG 4236 AY527415.1Foxy ------

AM983455.1Achr ------GACATTGAC------TCTTTGCCTGTTGTCGACACCGCAGCC------171 FJ858801.1On-u CTAGTCTAGTGAAACTGATAGCTTTGTTTCTGACTTTTAATGGACCCATGGCCTGAGCCT 4296 AY527415.1Foxy ------TTGAT------CTGTTCGGTCATGAGTAT------3057 *** ** * *

AM983455.1Achr ------FJ858801.1On-u AAACTTCGCATGTATCTGATGGTTAGTATCCCTGTGGGGAACTACTTTGTTGGCCAAGAG 4356 AY527415.1Foxy ------

AM983455.1Achr ------CATCTTGCCTCCTCAGATTCCG------193 FJ858801.1On-u ACATTTGACCACGCACCTTCCGTTTCAAAATCTCAACTACTGTTAATAAACATTAAAACT 4416 AY527415.1Foxy ------CTTCCTGTCTTCGAAGG------3074 * * *

AM983455.1Achr ------CCACCTTCGAA 204 FJ858801.1On-u TCTCCCACATATCTGTTTTTTTTCTCGCTCCCAAAACTTTATTCTCTTTCTACCTTCAAC 4476 AY527415.1Foxy ------CGAA 3078 * *

AM983455.1Achr ACAACACCGGA------GGCCAATTCT 225 FJ858801.1On-u ATAAGACAAGAAATCTAAAACTTGCAAACCACGACAATGGCGCCATCAACGGCTGCTACT 4536 AY527415.1Foxy ATCAAACCCCA------TGGCCCT 3096 * * ** * **

MAT1_86F: CGCCTTCATG MAT1_3132F: AACGSCTACATG MAT1_3149F: TCATR AM983455.1Achr TCTGAACCTGCCAAAGAAAAGGCAAAA------CGGCCTCTGAACGGCTTCATG 273 FJ858801.1On-u CCCGGCACACCTGATAAGCAAGGAAAGAATCCCCAGTACCGGCCGCTAAATGCATTCATG 4596 AY527415.1Foxy CTAGGTACCTCGGATTCGAGGGCTAAG------CGCCCTCTTAACGCCTTCATG 3144 * * * * * ** ** ** ** ** * ****** MAT1_86F: GCCTTTCG MAT1_3132F: GCCTT MAT1_3149F: GCYTTYCGMWGTAAG AM983455.1Achr GCCTTCCGCAGTAAGTGCCA-----CCCGTGAGCCCAATATCGAAAATCATTCTAACTGA 328 FJ858801.1On-u GCCTTCCGATGTACGTGTTAACGATCCCAGGTCCTCAAATAATAACAAAACTATAGCCTA 4656 AY527415.1Foxy GCCTTTCGCAGTAAGAGTAAACATTTGCTTGTTCATGGCACGTA------TTGA 3192 ***** ** *** * * * * * * * *

AM983455.1Achr GAGGAAACAGGCTACTATCTGAAGATATTCTCTGGTGTCCCTCAGAAATCCGCTTCCGGT 388 FJ858801.1On-u ACATTCAAAGCATTCTATAACAGGATGTTGCCTAACATGCAGCAAAAAGATCGGTCCGGG 4716 AY527415.1Foxy CCAGATCTAGCCTACTATTTGAAGCTGTTCCCCGATACCCAGCAGAAGAATGCCTCCGGT 3252 ** * **** * * * ** * * ** ** *****

MAT1_3303R: GATTGCYAAR MAT1_891R: ATTGCCGCA AM983455.1Achr TTCTTGACCATGCTCTGGCACAAGGACCCTTTCCGAAATCGATGGGCCTTGATTGCCAAG 448 FJ858801.1On-u GTTCTCACATCTCTATGGGGCATAAATCCTCACAAAAACCAATGGACTATGATGGCCAAA 4776 AY527415.1Foxy TTCCTGACCCAGCTCTGGGGCGGCGACCCTCACCGAAATAAATGGGCCCTGATTGCTAAA 3312 * * ** ** *** * * *** * *** **** * **** ** **


MAT1_3303R: GTCTAYTCYTT MAT1_891R: GTGTATAGCTT AM983455.1Achr GTCTACTCTTTCATTAGAGATGAAATCGGCAAGGACAAGGTCCCCCTCTCGCAGTTTCTG 508 FJ858801.1On-u GTCTACTCGTTTCTTAGGGCAGAGCTCGGGAAAGACGCCGTCCTTCTTCCGTCCTTCTTA 4836 AY527415.1Foxy GTCTATTCCTTTCTTCGCGATCAACTCGGCAAGGGTCCTGTTAACTTGTCCGCCTTCCTT 3372 ***** ** ** ** * * * **** ** * ** * * ** *

AM983455.1Achr GGTCTCTGCTGCCCTCTCATGAATATTATC------538 FJ858801.1On-u GAGCATTCTTGTGCTGTTTTAAGAATTCCCTCCCTCGATACCTATCTACAACAGCAAGGC 4896 AY527415.1Foxy GGTATCGCTTGCCCTTTGATGAGCATGGTT------3402 * ** ** * * * **

AM983455.1Achr ------FJ858801.1On-u TTTGCACTGCTCGAGAATGAGACCGGAAGTTTGAGCCTTGTGAACCAAGCCAAGCCGACT 4956 AY527415.1Foxy ------

AM983455.1Achr ------GAGCCCTCCGATT 551 FJ858801.1On-u CCATCAAATCCACTATCGTCTATTCAAGACCTCTCCGAAGACTGTGGAAGCCATGCAAGT 5016 AY527415.1Foxy ------GAACCCTCCATCT 3415 * ** * * *

AM983455.1Achr ATCTTTC------CCGCCTCGGCTG------GAATGTCGAGAAC 583 FJ858801.1On-u GTCTCTTTTGAAGACACTTCTGCGTTCACTGAGGCCCCTGGTTTTCTGAACTTCAAAAAC 5076 AY527415.1Foxy ACAT------3419 *

AM983455.1Achr GACGACAGTGGCAGT------CAGCAGCTGCTAC------611 FJ858801.1On-u AATAACCAAGAAAGCATCGATGCAGCTGTAGCTGCTTTCCAAGGCATAGGCCAACAGGAA 5136 AY527415.1Foxy ------

AM983455.1Achr ----GGCATGACAATTTCACCGACCTCGACGCAGTCGCTCTGGAGTCT------655 FJ858801.1On-u ACCAGTCTTGACGTTATGTCCGACAACGAGGAAGCAGCTGCAGGTCTTTACGCAGAGCAT 5196 AY527415.1Foxy ------CGAGGTCCTTGGCTGGGAGCAAACTGCTCC------3449 * * * * ** * ***

AM983455.1Achr ------FJ858801.1On-u GAGCTTTTGCAGGCAGTCTTTGACCGTGGCTTGGAACTTGACCCCAAGACGTTGAAGGAA 5256 AY527415.1Foxy ------

AM983455.1Achr ------FJ858801.1On-u GGGGAGACGGCCGAGAAGCAGAAGCAATGTCTAGTACAAAAGCTCTGCTCAATGGCCTAT 5316 AY527415.1Foxy ------

AM983455.1Achr ------CAAGGCCTTCCCACA---- 670 FJ858801.1On-u GAAATGCTTATCGGTGGGGAGTTTCCAGTCATCGTTGGCACCAAGGGCTTCTCTCATACC 5376 AY527415.1Foxy ------CCAAGGTATCGTGACA---- 3465 ***** * **

AM983455.1Achr ------FJ858801.1On-u ATAAAGCACAACCCCATTGCCGCAGTATCTCACTTATTCTCTCGTGAGAACAAGCTGGAC 5436 AY527415.1Foxy ------

AM983455.1Achr ------ACCGAGTTCGATCTCCTGTCCTCCAT------696 FJ858801.1On-u TTCGATGTTCTCGTTGTGGACAAGGAAGACCGAGTGAAAAATCACGTCTTTTGTCAAGTC 5496 AY527415.1Foxy ------

AM983455.1Achr ------TCTCGACATTGGTGTAATTCCCGCGAAAA------725 FJ858801.1On-u AATCCTGAAGGTTCTGACAAAGCCGTCATTCTCTCAACAACTGGACAACAGACAGTCAAT 5556 AY527415.1Foxy ------TTTCAGCAAAACACTGATATCATGAAGGC------3494 * ** * ** *

AM983455.1Achr ------GTCTCGAGCTGATCCAGCGACTCAACGTCCATAATAATGG 765 FJ858801.1On-u GCACAAACAATCGTGGGACCATTTCATACTGGCACACCGCAACAACCTCACAAGAATCGA 5616 AY527415.1Foxy ------GAACCTGGCACGCCTTTTCAATGTCCACCCTACCAC 3530 *** * * *** ** *

AM983455.1Achr CAGAATGGTC------TCTGACTCGAGTCGACGTCCCA------TTCCT 802 FJ858801.1On-u TCGCATGATCCGGATATGGCAGAGAATTCAACGCCAACCTACACTCAAAAAATGGCCTAT 5676 AY527415.1Foxy CGAGATCGAT------CTTCTCACCAGTATCCTGTCAG------3562 ** * * * * * MAT1_3644R: G AM983455.1Achr TGCACTATCGAAAAGCTTGATTTTCTCTGCACT------GCACAGGAA 844 FJ858801.1On-u CAGTCCCCCCAGAATGGCGGTCGTTTCCACATTTTTACCAGTCAGCCATCAACTCTCGGA 5736 AY527415.1Foxy ----CTGGCTACTTTACGGATTTCTCTCAGGTTCTCCTGATGCGCATGTGGGCTTGCCAG 3618 * * * * * * *

MAT1_3644R: AANGGCATCATGACCAC AM983455.1Achr GACCCAATCAAAGCTACCA------GGATGC 869 FJ858801.1On-u AACAGCATTATCACTAACACTAACAGCAATATTGGTGAGGCTTCCAGTCCCAGTGGATTC 5796 AY527415.1Foxy AATGGCATCATGACCACTA------3637 * ** * * * *

AM983455.1Achr TATTCGGGCAACACTACGATGAGCAAGTAATGCAGC---AGTCCGGATACCAACTTC--- 923 FJ858801.1On-u AATTCCAATGCTGTAGCAACAGGCAAGCATGACAGCCCCGAATCGTATGCCAGCTTTGAG 5856 AY527415.1Foxy ------CAAGTGCTACCGCTGGCAACAATGTCGCCACTAC--- 3671 **** * ** * * **

AM983455.1Achr ------FJ858801.1On-u GGCAGCGCCAGTGCCAGTGCTAGTGCCAGTGCCAGCTGCTCGACATCGCCCCCGGTAGAT 5916 AY527415.1Foxy ------

AM983455.1Achr ------ACGAAGTCGAAGA------936 FJ858801.1On-u CATGCCTGGAGCCAGTCACATGACGACTACAAAAACGAAGACGGCACTGAGAACAATTTG 5976 AY527415.1Foxy ------

AM983455.1Achr ------TCTCAGTCG---- 945 FJ858801.1On-u ATCGACATTAACTATGCCCAGGGCGGCGCCTCTGGCCGCCGTTTCAGTCTCAGCTGTGCC 6036 AY527415.1Foxy ------TCAACCG---- 3678 *** *

AM983455.1Achr ------CATCGGGCATCTCCCACTGCACCCACCAATGAAATG---- 981 FJ858801.1On-u TCCGATGCTGAGACCATCGTCATCACGTACCAGCCCCAGCCCCAGCCCAAGACAGGTGTG 6096 AY527415.1Foxy ------CTCTACGAGTCTGTTCCAACGACAGCCGAGAA------3710 * * * * * * ** * *

AM983455.1Achr ------FJ858801.1On-u TCGTGGTTTGGCAGCCAATTCAGCAACTCTGCCAGCTCTACGGCCGTTCTGGGGACACAT 6156 AY527415.1Foxy ------

AM983455.1Achr ------FJ858801.1On-u GGCGCGCACTTCCATCAGCAGACGCAGCAACAGCAGGCTCCGTTCCTGACCACTTCTCAT 6216 AY527415.1Foxy ------

AM983455.1Achr ------FJ858801.1On-u TCTATGCCCGTCTTACCCGCAACAAGCAACGCTGTCCAGCATACTCGGAAGCGCACCCAC 6276 AY527415.1Foxy ------

AM983455.1Achr ------GTATGACCC------990 FJ858801.1On-u GATGCCATTGAGTATGTCCCTAGCGCTAACCAAGCAAGCAAGCGTGGCCGTGGCCGTGAT 6336 AY527415.1Foxy ------AGTCAGTTTC------3720 ** * *

AM983455.1Achr ------FJ858801.1On-u AGTGGCCACTTGAGCCAGGATGCAGGCTTTCTCCTGCAAACACTCGTCCAGAACCCCGCA 6396 AY527415.1Foxy ------


AM983455.1Achr -----ATACGCGCAGCTCGGCCTAGACCAAC------CAATCTTCGATC------1028 FJ858801.1On-u GAAGTACAGGCACAGCTCAGCCAAGGCCAGCGCCCGCCTGTCTTTGACCAGATCTACCGC 6456 AY527415.1Foxy -----ATCAACGCAGTTC-GCGAGAGCCGCAACCTTGCTGCTCACGATC----TTTTTGG 3770 * * *** ** ** ** * ** *

AM983455.1Achr ------CCGACGCAGTTCCCGAGCACGAATGCTAC------1057 FJ858801.1On-u GCCCAAATGGCACACTCTCTTTCTCTGAGCACGGCCGCTTTTGATCCATTGCCCAGAACC 6516 AY527415.1Foxy TCCCGAATACGACGCCGCCTTTTTCGGAAATCGTTTCGTTCATTCCTG------3818 * * * * ** ** *

AM983455.1Achr ------FJ858801.1On-u ACTGTTGCTCCTGGCGCTCCCAGCGAGGATCCCATGATGCCCTCAATGCATTCCCTGCAC 6576 AY527415.1Foxy ------

AM983455.1Achr ------GATGTGGGTAACCCCTACGACATAGACAACATC------1090 FJ858801.1On-u TCTATGCATCCCATGGAGCAGATGCAGGCTACCGTTGCAGCCCAGACTACGTCAAGCCTT 6636 AY527415.1Foxy ------GGAAGTGCAGGATCTCACTTCTTTCCAGAATGT------3851 ** ** * * * ***

AM983455.1Achr -----CTCGGGTTTCC------G 1102 FJ858801.1On-u CTCAACTACAATTTCCCACAGTCCATGCCTAACATGCACGCCATCCATGCCACCCAGCAG 6696 AY527415.1Foxy ------TCAGATTTCT------GTGG 3865 * **** *

AM983455.1Achr CAGGGTGGTCTGGGGGGTGAGGCTGCTTTTCCGC------CTGCTCCC 1144 FJ858801.1On-u CACAGTAACTTGGCCTCTGTGTTCTCTTCCATGCAGAACCCGGCCGACGGTGCTGCACTC 6756 AY527415.1Foxy CGGACTCGCCTATGGAGAGCAACACTCTCTACAAC------TTCCGCAT 3908 * * * * * * * *

AM983455.1Achr CCTTTCTCGGCTC------AGGAGG 1163 FJ858801.1On-u GATCTCTCGGATCCTGCAATGCTTGATCGTTTTCTTGGAATCGCGAGTGAGAGCACGGAG 6816 AY527415.1Foxy GCCTTCGCAGTGCCT------CCCCCAG 3930 ** * * * *

AM983455.1Achr ACTTCCAGCCCATGTTCTGA------1183 FJ858801.1On-u ACTGCTGACTCTAGCTTTGGTTGGCCTTGTTGATTTATTATCTCTGTTTTCACATTTATT 6876 AY527415.1Foxy GTTTCTGAGTTCGACCTC------TACGATGTCATCGACACAAACATCATTGACATC 3981 * *

AM983455.1Achr ------FJ858801.1On-u TTTCGTTCATCTTTCACCCATCATTAGACATCTGAACATCTCACGATAGGACTATGGCTC 6936 AY527415.1Foxy TCCAGCGCCTGGTCTATTGACCAGTACCTACATGAG------4017

AM983455.1Achr ------FJ858801.1On-u GGCGTCCAAGGGATATTTTTATAAATATATGGACTGCATTTATCAACTTTGCATGTTGCC 6996 AY527415.1Foxy ------

AM983455.1Achr ------FJ858801.1On-u TTTTCTTTTCGAATTTCTATTACACTTGTTTATTTTCTGCTTGTACTAGCATTTTATGTC 7056 AY527415.1Foxy ------

AM983455.1Achr ------FJ858801.1On-u TTTCTATTCATATAATTTGTCTTGATATTT 7086 AY527415.1Foxy ------
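Several of the primers shown above the alignment contain IUPAC ambiguity codes (for example Y, M, W and N). As a rough illustration of how such a degenerate primer can be checked against a template, the sketch below expands the codes into a regular expression and searches for exact sites. The function names are hypothetical, the primer string is only the fragment of MAT1_3149F printed over the second alignment block, and the check ignores mismatches, reverse complements and melting temperature, so it should not be read as the primer design procedure used in this work.

```python
import re

# IUPAC nucleotide ambiguity codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_to_regex(primer):
    """Translate a degenerate primer into a regular-expression string."""
    parts = []
    for code in primer.upper():
        bases = IUPAC[code]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)


def find_primer_sites(primer, sequence):
    """Return 0-based start positions of exact matches, including overlaps."""
    # A lookahead makes each match zero-width so overlapping sites are reported.
    pattern = "(?=" + primer_to_regex(primer) + ")"
    return [m.start() for m in re.finditer(pattern, sequence.upper())]


def degeneracy(primer):
    """Number of distinct oligonucleotides encoded by a degenerate primer."""
    count = 1
    for code in primer.upper():
        count *= len(IUPAC[code])
    return count


# Fragment of MAT1_3149F and two template fragments copied from the alignment.
primer_fragment = "GCYTTYCGMWGTAAG"
achr_fragment = "GCCTTCCGCAGTAAGTGCCA"      # from AM983455.1
onu_fragment = "GCCTTCCGATGTACGTGTTAACG"    # from FJ858801.1

print(primer_to_regex(primer_fragment))
print(find_primer_sites(primer_fragment, achr_fragment))
print(find_primer_sites(primer_fragment, onu_fragment))
print(degeneracy(primer_fragment))
```

An empty list means only that no perfect match exists in the fragment tested; a primer may still anneal with one or two mismatches.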


Appendix 5.2 Alignment of MAT1-2-1 sequences collected from GenBank for primer design. Sequence outside of the primer design region was truncated for simplicity.

MAT2_488F: AATGCCTACATTCTCTACCGC MAT2_1404F: CCTYCATYCTSTACCGCAA AB080673.2Mgri ------AGACAAGATCCCTCGTCCTGCGAATGCCTACATCCTGTACCGCAAGGATTG 51 AY357890.1Gcin TGTTTGCAAGATCAAGGTTCCTCGACCTCCCAATGCCTTCATTCTCTACCGCAAAGATCG 60 **** * ***** *** * ******* *** ** ******** *** *

AB080673.2Mgri GCATCCCATAGTCAAGTCTGCAAATCCAGGAATCCATAACAACGAAATCTGTGAGTTCTT 111 AY357890.1Gcin CCATGCCACCATGAAGCAGGAAAACAGTCACCTAAGCAACAACGATATTTGTGAGTACTT 120 *** *** * *** * *** * ******** ** ******* ***

AB080673.2Mgri TTACCTCCAAACTTCCCT----AGCTGCTGCTAACTTTTTTTCCTTTAGCAAAAATCCTG 167 AY357890.1Gcin TGACTT---GGTTCCCCTCGGGAACAGGCGCTGACCA-----ACAACAGCCATAAGCCTA 172 * ** * * **** * * * *** ** * *** * ** ***

AB080673.2Mgri GGGAAGCAGTGGGCAGCTGAGACACCGGAAGTGCGGGCTGAGTACAAGGAGCTCGCGGAG 227 AY357890.1Gcin GGCAAGAAATGGAACAGCGAATCACCAGCCGTGCGCCAGAAGTATACCGAACTTGCAAAG 232 ** *** * *** ** **** * ***** **** * ** ** ** **

AB080673.2Mgri GAGAAGAAGAGAGAGTTCTACGCAAAGTACCCTACATATCGCTACTCTCCTCGTCGGCCG 287 AY357890.1Gcin ATGCACAAGGAGCGCCTCTTGATGATGTATCCCGACTACCGTTACACCCCTCGCAAGCCG 292 * * *** *** * *** ** ** ** *** * ***** ****

AB080673.2Mgri AGTGAGA------294 AY357890.1Gcin TCTGAGAAGCGCCATCGAAAGCCTAGTGGCCAAAGCAAGAAGACCAGCTTAGCAGCGTCG 352 *****

AB080673.2Mgri ------TAATGCGCCGA------305 AY357890.1Gcin ATGAGGTAGAGGCCGCCTCCTACCCGAGAGCGACAAAACCATCAGCGCGCCGACTTGTCG 412 * *******

MAT2_1811R: TGCGRCAAGKCGSTGTA AB080673.2Mgri ------GTGAGTT------TGTAACACATTATTGC------328 AY357890.1Gcin GCTCCAGACGGAGGAGTATGAGTTATACCTACCGACTGCGACAAGTCGCTGTAGTATCAG 472 ****** ** *** * **

AB080673.2Mgri ----GTGAAAGAGTGGACAGCGA-GCTAACAAATTACTT------362 AY357890.1Gcin TCATGGAACGGAATAGATATCGGCGCTGCCAAACGACTTTAGGTCGTCCTTGTGTTCTAG 532 * * ** * ** * ** *** **** ****

AB080673.2Mgri ------AY357890.1Gcin ACTGGCAGCATCAGCGACCAGAGGAGCCTCAGAACATGGCGCCTTTGGAGGCTGTCTTCG 592

AB080673.2Mgri ------GCCAACAAGA------372 AY357890.1Gcin ATCTCTATCCAGTCAGCGTGACCCAGGAGGCGACCACGCATCTGTTCGTCGATTCACTTC 652 * ** * **

AB080673.2Mgri ------ACACGAAAAGCCGAAAGACCCCAACG------398 AY357890.1Gcin CCGCATGCTGTGGCAAGAGTTACCTGCGCGAGAAACCAACAGACCTTTGTGCGGCGATAC 712 * *** ** ** * ***** *

AB080673.2Mgri ------AACGGGGCAGCGAAGCCGAG------418 AY357890.1Gcin TTAGACGTACGAACGCTAGGAGATCACCGGATCTGAGGAACCAAGAATCATCCTGGACCA 772 * *** * * * * ** **

AB080673.2Mgri ------CAAAGCAAAGAAGTGCGGC--AACGGGGTTACTGGTGGAGCA 458 AY357890.1Gcin TCTGCTTCGATGGCAACGCGAAGCAGAGATCTTCGATTTGATGAACATATTGGTGCAAAA 832 * ***** *** * ** * * ** ***** * *


AB080673.2Mgri G------459 AY357890.1Gcin ATGTACTCACAACCCATGTATTCGCATGTCAAGCGAATACGGATCTTTTTCTGTAATTAT 892

AB080673.2Mgri -----ACAGTGCAGTCGATGTCAGC------479 AY357890.1Gcin AATCTGCAAGCCAGTAAATATCATCTTCATGGACCATCTATTCCAGCCCGATACCTAGAG 952 ** **** ** *** *
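The asterisk rows in these alignments mark columns in which every sequence carries the same base. A minimal sketch of how such a row can be derived is given below; the function names and the toy sequences are hypothetical, gap characters are assumed to be "-", and the percent identity shown (identical columns divided by alignment length) is only one of several conventions in use.

```python
def conservation_line(aligned):
    """Return a string with '*' under fully conserved columns, ' ' elsewhere."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    for column in zip(*aligned):
        residues = {c.upper() for c in column}
        # A column is conserved only if it holds a single base and no gap.
        conserved = len(residues) == 1 and "-" not in residues
        marks.append("*" if conserved else " ")
    return "".join(marks)


def percent_identity(aligned):
    """Share of alignment columns that are fully conserved, in percent."""
    line = conservation_line(aligned)
    return 100.0 * line.count("*") / len(line)


# Toy alignment of two made-up sequences (not taken from the appendix).
toy = ["ACGT-A", "ACCTTA"]
print(repr(conservation_line(toy)))
print(round(percent_identity(toy), 1))
```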


Appendix 5.3 Alignment of SLA2 coding sequences. Note that trailing sequence has been truncated.

g10807_mm10095_SLA2 ------atggcgtccgcccgcagtctcgaccacgcaaagtccgaggctgagctggccatt g6682_mn10106_SLA2 ------atggcgtccgcccgcagtctcgaccacgcaaagtccgaggccgagctggccatc g6892_mb07020_SLA2 cccaccatggcgtccgcccgtagtctcgaccacgcaaagtccgaggccgagctggccatc g9065.t1_mm99049_sla2 ------atggcgtccgcccgcagtctcgaccacgcaaagtccgaggctgagctggccatt g814.t1_mn12262_sla2 ------atggcgtccgcccgcagtctcgaccacgcaaagtccgaggccgagctggccatc g10271.t1_mn11037_sla2 ------atggcgtccgcccgcagtctcgaccacgcaaagtccgaggccgagctggccatc ************** ************************** *********** g10807_mm10095_SLA2 aacataaaaaaggccacaagtcccgaggagtcggccccgaagcgcaagcacgtccgcagc g6682_mn10106_SLA2 aacatcaaaaaggctacgagtcccgaggagtcggccccgaagcgcaagcacgtccgcagc g6892_mb07020_SLA2 aacatcaagaaggccacgagccccgaggagtcggcgcccaagcgcaagcacgtccgcagc g9065.t1_mm99049_sla2 aacataaaaaaggccacaagtcccgaggagtcggccccgaagcgcaagcacgtccgcagc g814.t1_mn12262_sla2 aacatcaaaaaggctacgagtcccgaggagtcggccccgaagcgcaagcacgtccgcagc g10271.t1_mn11037_sla2 aatatcaaaaaagccacgagccccgaggagtcggcccccaagcgcaagcatgtccgcagc ** ** ** ** ** ** ** ************** ** *********** ********* g10807_mm10095_SLA2 tgcatcgtctacacatgggatcacaagtcctcccagtccttctgggctgggctcaaggtg g6682_mn10106_SLA2 tgcatcgtctacacatgggaccacaagtcctcccagtccttctgggctgggctcaaggtg g6892_mb07020_SLA2 tgcatcgtctacacatgggaccacaagtctgcccagtccttctgggctgggctcaaggtg g9065.t1_mm99049_sla2 tgcatcgtctacacatgggatcacaagtcctcccagtccttctgggctgggctcaaggtg g814.t1_mn12262_sla2 tgcatcgtctacacatgggaccacaagtcctcccagtccttctgggctgggctcaaggtg g10271.t1_mn11037_sla2 tgcatcgtctacacatgggaccacaagtctgcccagtccttctgggctgggctcaaggtg ******************** ******** ***************************** g10807_mm10095_SLA2 cagcccatcctcgccgacgaggtccagacttacaaggcgctcatcaccattcacaaggtc g6682_mn10106_SLA2 cagcctatcctcgccgacgaggtccagacatacaaggcgctcatcaccattcacaaagtc g6892_mb07020_SLA2 cagcccatcctcgccgatgaggtccagacgtataaggcgcttattacgatccacaaggtc g9065.t1_mm99049_sla2 cagcccatcctcgccgacgaagtccagacttacaaggcgctcatcaccattcacaaggtc g814.t1_mn12262_sla2 
cagcctatcctcgccgacgaggtccagacatacaaggcgctcatcaccattcacaaagtc g10271.t1_mn11037_sla2 cagcccatcctcgccgacgaggtccagacatacaaggcgctcatcaccatccacaaggtc ***** *********** ** ******** ** ******** ** ** ** ***** *** g10807_mm10095_SLA2 ctccaagagggccacccacaaactctcagggaggcaatggccaatcggagctggatcgac g6682_mn10106_SLA2 ctgcaagagggccacccgcaaaccctcagagaggcgatggccaaccggagctggatcgac g6892_mb07020_SLA2 ctccaagagggccacccgcaaactctgagggaagcaatggccaaccggagctggatcgac g9065.t1_mm99049_sla2 ctccaagagggccacccacaaactctcagggaggcaatggccaatcggagctggatcgac g814.t1_mn12262_sla2 ctgcaagagggccacccgcaaaccctcagagaggcgatggccaaccggagctggatcgac g10271.t1_mn11037_sla2 ctccaagaggggcacccgcagactctcagagaggcaatggccaaccggagctggatcgac ** ******** ***** ** ** ** ** ** ** ******** *************** g10807_mm10095_SLA2 agcctcaacaggggcatgagcggcgagggtatgcgtggatatgcccctctcattcgggag g6682_mn10106_SLA2 agcctgaatagaggcatgagcggcgagggtatgcgtggatatgcccctctcattcgggag g6892_mb07020_SLA2 agtctgaaccgaggcatgagcggcgagggcatgcgtgggtatgctcctcttatccgggaa g9065.t1_mm99049_sla2 agcctcaacaggggcatgagcggcgagggtatgcgtggatatgcccctctcattcgggag g814.t1_mn12262_sla2 agcctgaatagaggcatgagcggcgagggtatgcgtggatatgcccctctcattcgggag g10271.t1_mn11037_sla2 agccttaacagaggcatgagcggcgagggtatgcgtggatatgctcctctcattcgggag ** ** ** * ***************** ******** ***** ***** ** ***** g10807_mm10095_SLA2 tatgtatactttctactggcgaagctctcctttcaccaacagcaccctgaattcaacggc g6682_mn10106_SLA2 tatgtatactttctgctggcgaagctctcattccaccagcagcaccctgagtttaacggc g6892_mb07020_SLA2 tatgtgtacttcctgctggcgaagctctccttccaccagcagcacccagagtttaacggc g9065.t1_mm99049_sla2 tatgtatactttctactggcgaagctctcctttcaccaacagcaccctgaattcaacggc g814.t1_mn12262_sla2 tatgtatactttctgctggcgaagctctcattccaccagcagcaccctgagtttaacggc g10271.t1_mn11037_sla2 tatgtatacttcctgctggcgaagctttcgttccaccagcagcaccccgagttcaacggc ***** ***** ** *********** ** ** ***** ******** ** ** ****** g10807_mm10095_SLA2 actttcgagtacgaagagtacgtgtcgctcaaggccacaaacgaccccaacgaggggtac g6682_mn10106_SLA2 
accttcgagtacgaagagtacgtctcgctcaaggccacaaacgaccctaacgagggatat g6892_mb07020_SLA2 accttcgagtacgaagagtacgtctcgctcaaggccacaaacgaccccaacgaaggatac g9065.t1_mm99049_sla2 actttcgagtacgaagagtacgtgtcgctcaaggccacaaacgaccccaacgaggggtac g814.t1_mn12262_sla2 accttcgagtacgaagagtacgtctcgctcaaggccacaaacgaccctaacgagggatat g10271.t1_mn11037_sla2 accttcgagtacgaagagtatgtgtcgctcaaggccacaaacgaccccaatgaagggtat

** ***************** ** *********************** ** ** ** ** g10807_mm10095_SLA2 gaaaccatcatggatctcatgacgctgcaagataagatcgaccagttccagaagctcatc g6682_mn10106_SLA2 gagactatcatggaccttatgacgctgcaagataagatcgaccagttccagaagctcatc g6892_mb07020_SLA2 gagacgatcatggacctcatgaccctgcaagataagatcgaccagttccaaaagctcatc g9065.t1_mm99049_sla2 gaaaccatcatggatctcatgacgctgcaagataagatcgaccagttccagaagctcatc g814.t1_mn12262_sla2 gagactatcatggaccttatgacgctgcaagataagatcgaccagttccagaagctcatc g10271.t1_mn11037_sla2 gagaccatcatggacctcatgacgctgcaagacaaaatcgaccagtttcagaaactcatc ** ** ******** ** ***** ******** ** *********** ** ** ****** g10807_mm10095_SLA2 ttttcacacttccgcaacgtcggcaacaacgagtgccgtatctcttccctcgtgcccctt g6682_mn10106_SLA2 ttctcacacttccgcaatgtcggaaacaacgagtgccgcatttcctccctcgtgcccctt g6892_mb07020_SLA2 ttctcgcacttccgcaacgtcggtaacaacgagtgccgcatctcgtctctcgtgcccctc g9065.t1_mm99049_sla2 ttttcacacttccgcaacgtcggcaacaacgagtgccgtatctcttccctcgtgcccctt g814.t1_mn12262_sla2 ttctcacacttccgcaatgtcggaaacaacgagtgccgcatttcctccctcgtgcccctt g10271.t1_mn11037_sla2 ttctcacacttccgcaatgtcggcaacaacgagtgccgtatatcctccctcgtaccactc ** ** *********** ***** ************** ** ** ** ***** ** ** g10807_mm10095_SLA2 gtcgccgagacatatggcatctacaagttcatcacgagcatgctgcgtgccatgcactcc g6682_mn10106_SLA2 gtcgctgaaacatacggcatctacaagttcatcacaagcatgctgcgtgccatgcactcc g6892_mb07020_SLA2 gtcgccgagacgtacggcatctacaagttcatcacgagcatgctgcgtgccatgcactcc g9065.t1_mm99049_sla2 gtcgccgagacatatggcatctacaagttcatcacgagcatgctgcgtgccatgcactcc g814.t1_mn12262_sla2 gttgccgagacatacggcatctacaagttcatcacaagcatgctgcgtgccatgcactcc g10271.t1_mn11037_sla2 gtcgccgaaacatacggtatctacaagtttatcacgagcatgctgcgtgccatgcactcc ** ** ** ** ** ** *********** ***** ************************ g10807_mm10095_SLA2 acaactggcgatgacgaggctctcgagccgctgcgtggacgttacgacgcccagcactac g6682_mn10106_SLA2 acaactggcgatgacgaggctcttgagcctctgcgtgggcgttacgacgcccagcactac g6892_mb07020_SLA2 acgactggcgacgacgaggctcttgagcccctacgcggccgctacgacgctcagcactac g9065.t1_mm99049_sla2 
acaactggcgatgacgaggctctcgagccgctgcgtggacgttacgacgcccagcactac g814.t1_mn12262_sla2 acaactggcgatgacgaggctcttgagcctctgcgtgggcgttacgacgcccagcactac g10271.t1_mn11037_sla2 acaactggcgacgacgaggctcttgagccgctgcgtggacgctacgatgctcagcactac ** ******** *********** ***** ** ** ** ** ***** ** ********* g10807_mm10095_SLA2 cgactcgtcaagttctactatgagtgctccaacttgcgctacctgacgagcttgatcact g6682_mn10106_SLA2 cggcttgtcaaattctactatgagtgctccaacttgcgctacctgaccagcttgatcacc g6892_mb07020_SLA2 agactggtcaagttctactacgagtgctccaatcttcgctaccttaccagcctgatcacg g9065.t1_mm99049_sla2 cgactcgtcaagttctactatgagtgctccaacttgcgctacctgacgagcttgatcact g814.t1_mn12262_sla2 cggcttgtcaaattctactatgagtgctccaacttgcgctacctgaccagcttgatcacc g10271.t1_mn11037_sla2 cgactcgtcaagttctattacgaatgctccaacctgcgctaccttaccagcttaattacc * ** ***** ***** ** ** ******** * ******** ** *** * ** ** g10807_mm10095_SLA2 attcccaaactgccacaggatcctccgaacctcttggccgatgatgagaatgcgccagcg g6682_mn10106_SLA2 attcccaaactgccacaggatcctccaaacctcctggctgatgatgagaatgcgccagcg g6892_mb07020_SLA2 atccccaagcttccacaagatcctccaaacttgctcgccgacgacgagaatgcgccagcg g9065.t1_mm99049_sla2 attcccaaactgccacaggatcctccgaacctcttggccgatgatgagaatgcgccagcg g814.t1_mn12262_sla2 attcccaaactgccacaggatcctccaaacctcctggctgatgatgagaatgcgccagcg g10271.t1_mn11037_sla2 attcccaaactgccacaggatcctccgaacctcctggccgaagacgagaatgcgccatcg ** ***** ** ***** ******** *** * * ** ** ** ************ ** g10807_mm10095_SLA2 cttcctgctcggcccaagcaggagatcgagcgccagccgactccggtgccgcagcccaag g6682_mn10106_SLA2 cttcctgctcgccccaagcatgagatcgagcgtcaaccgactccggtgccccagcccaag g6892_mb07020_SLA2 cttcctgcgcgtccgaagcaggagatcgagcgtcaaccaaccccggtgcagcagcccaag g9065.t1_mm99049_sla2 cttcctgctcggcccaagcaggagatcgagcgccagccgactccggtgccgcagcccaag g814.t1_mn12262_sla2 cttcctgctcgccccaagcatgagatcgagcgtcaaccgactccggtgccccagcccaag g10271.t1_mn11037_sla2 cttcctgctcgccccaaacaggagatcgagcgccagccgactccggtacagcagcctcag ******** ** ** ** ** *********** ** ** ** ***** * ***** ** g10807_mm10095_SLA2 
accgatgagcccgaccaaatcgccgagttctggcaaggagaaatcgagaggcagaataag g6682_mn10106_SLA2 accgaggaacccgaccaaatcgccgagttttggcaaggagaaattgagaggcagaacaag g6892_mb07020_SLA2 aacgacgaacccgaccagattgcagagttttggcaaggagaaattgagaggcagaacaag g9065.t1_mm99049_sla2 accgatgagcccgaccaaatcgccgagttctggcaaggagaaatcgagaggcagaataag g814.t1_mn12262_sla2 accgaggaacccgaccaaatcgccgagttttggcaaggagaaattgagaggcagaacaag g10271.t1_mn11037_sla2 accgatgaacccgaccaaatcgccgagttctggcaaggagaaattgagaggcaaaacaag * *** ** ******** ** ** ***** ************** ******** ** *** g10807_mm10095_SLA2 gagtacgaggatcagcagagggtgctgcaagagcgccagcagcaatcactgctcgcacag g6682_mn10106_SLA2 gagtacgaggatcagcagagggtgttgcaggagcgccagcagcaatcactgctcgcccag g6892_mb07020_SLA2 gagtatgaagaccaacaacgggtactgcaggagcgtcagcagcaatcactgctcgctcag

g9065.t1_mm99049_sla2 gagtacgaggatcagcagagggtgctgcaagagcgccagcagcaatcactgctcgcacag g814.t1_mn12262_sla2 gagtacgaggatcagcagagggtgttgcaggagcgccagcagcaatcactgctcgcccag g10271.t1_mn11037_sla2 gagtacgaggatcagcagagggtgttgcaggagcgccagcagcaatcactacttgcccag ***** ** ** ** ** **** **** ***** ************** ** ** *** g10807_mm10095_SLA2 caacaagcacagatgcaggcacagcgggatttcgaagagcagcagcgccgtctagccgag g6682_mn10106_SLA2 caacaagcacagatgcaggcacagcgggatttcgaagagcagcagcgccgtctagccgag g6892_mb07020_SLA2 caacaggcacagatgcaggcacagcgggactttgaggagcagcagcgccgcctggccgag g9065.t1_mm99049_sla2 caacaagcacagatgcaggcacagcgggatttcgaagagcagcagcgccgtctagccgag g814.t1_mn12262_sla2 caacaagcacagatgcaggcacagcgggatttcgaagagcagcagcgccgtctagccgag g10271.t1_mn11037_sla2 caacaagcacagatgcaggcacagcgggatttcgaggatcaacagcgccgcctggccgag ***** *********************** ** ** ** ** ******** ** ****** g10807_mm10095_SLA2 cagcaacagcgcgaacaggaggcgctgctggcccagcaagcgcagtggcaaacgcaggga g6682_mn10106_SLA2 cagcaacagcgcgaacaggaggcgctactagcccagcaagcgcaatggcaaacacagggg g6892_mb07020_SLA2 caacaacagcgggaacaggaggcccttctcgcccagcaagctcagtggcaaacgcaggga g9065.t1_mm99049_sla2 cagcaacagcgcgaacaggaggcgctgctggcccagcaagcgcagtggcaaacgcaggga g814.t1_mn12262_sla2 cagcaacagcgcgaacaggaggcgctactagcccagcaagcgcaatggcaaacacagggg g10271.t1_mn11037_sla2 cagcaacagcgcgaacaggaggcgctccttgcccagcaagcgcaatggcacacgcaggga ** ******** *********** ** ** *********** ** ***** ** ***** g10807_mm10095_SLA2 cgtcttgcggaattggagcaggagaacctcaatgcaagagcgcagtacgagcgcgatcag g6682_mn10106_SLA2 cgtcttgcggaattggagcaggagaacctcaatgccagggcacagtacgagcgcgatcaa g6892_mb07020_SLA2 cgcctggcagagttggagcaggagaaccttaacgcaagggcacagtacgaacgggaccag g9065.t1_mm99049_sla2 cgtcttgcggaattggagcaggagaacctcaatgcaagagcgcagtacgagcgcgatcag g814.t1_mn12262_sla2 cgtcttgcggaattggagcaggagaacctcaatgctagggcacagtacgagcgcgatcaa g10271.t1_mn11037_sla2 cgtcttgcagaattggagcaagagaacctcaatgccagggcacagtacgagcgagaccag ** ** ** ** ******** ******** ** ** ** ** ******** ** ** ** g10807_mm10095_SLA2 
ttaatgctgcagcagtacgaccagcgtgtaaaggcactggaaggcgaactggggcaaatc g6682_mn10106_SLA2 ttaatgctgcagcagtacgaccagcgtgtaaaggcactggaaggcgaattggggcagatc g6892_mb07020_SLA2 ctcatgctccaacagtatgatcagcgtgtcaaggctctggaaagcgagttagggcagatt g9065.t1_mm99049_sla2 ttaatgctgcagcagtacgaccagcgtgtaaaggcactggaaggcgaactggggcaaatc g814.t1_mn12262_sla2 ttaatgctgcagcagtacgaccagcgtgtaaaggcactggaaggcgaattggggcagatc g10271.t1_mn11037_sla2 ctcatgctgcagcagtacgaccagcgtgtaaaggcactggaaggcgaacttgggcagatc * ***** ** ***** ** ******** ***** ****** **** * ***** ** g10807_mm10095_SLA2 caagcgagttacggccagcagatgacaagcaaggacgatcagatccgtgcactccaagaa g6682_mn10106_SLA2 caagggagttacggccagcagatgacaagcaaggatgatcagatccgcgcactccaagaa g6892_mb07020_SLA2 caggcgagctatggacagcagatgacaagcaaggacgaccaaatccgcgctcttcaagag g9065.t1_mm99049_sla2 caagcgagttacggccagcagatgacaagcaaggacgatcagatccgtgcactccaagaa g814.t1_mn12262_sla2 caagggagttacggccagcagatgacaagcaaggatgatcagatccgcgcactccaagaa g10271.t1_mn11037_sla2 caggcgagttacggccagcagatgacaagcaaggacgatcagatccgcgcactccaagaa ** * *** ** ** ******************** ** ** ***** ** ** ***** g10807_mm10095_SLA2 caggtcaacacctggcggagcaagtacgaggcacttgcgaagctttattcgcagctgaga g6682_mn10106_SLA2 caggtgaacacctggcggagcaagtacgaggcactcgcgaagctttattcgcagctgaga g6892_mb07020_SLA2 caggtcaacacttggcggagcaagtatgaggcacttgcgaagctgtactcgcagctcagg g9065.t1_mm99049_sla2 caggtcaacacctggcggagcaagtacgaggcacttgcgaagctttattcgcagctgaga g814.t1_mn12262_sla2 caggtgaacacctggcggagcaagtacgaggcactcgcgaagctttattcgcagctgaga g10271.t1_mn11037_sla2 caggtcaacacctggaggagcaagtacgaagcactcgcgaagctttactcgcagctgcga ***** ***** *** ********** ** ***** ******** ** ******** * g10807_mm10095_SLA2 catgaacatctcgatttgctgcagaagtttaagtcggtgcaactgaaagcagcctcggcc g6682_mn10106_SLA2 catgaacatctcgacctgctgcagaagttcaagtccgtgcaattgaaagcagcctcggcc g6892_mb07020_SLA2 cacgaacaccttgatcttctgcaaaaattcaagtccgtacagctgaaagcagcatcagcc g9065.t1_mm99049_sla2 catgaacatctcgatttgctgcagaagtttaagtcggtgcaactgaaagcagcctcggcc g814.t1_mn12262_sla2 
catgaacatctcgacctgctgcagaagttcaagtccgtgcaactgaaagcagcctcggcc g10271.t1_mn11037_sla2 catgaacaccttgacctactgcagaagttcaagtcggtacaattgaaggcagcctctgca ** ***** ** ** * ***** ** ** ***** ** ** **** ***** ** ** g10807_mm10095_SLA2 caggaggctatcgagcggcgagagaagctggagcgagagatcaagaccaaaaatctcgag g6682_mn10106_SLA2 caggaggctatcgagcggcgagaaaagctagagagagagatcaagactaaaaacctcgag g6892_mb07020_SLA2 caagaggctatcgagcgacgagagaagctggagcgagagatcaagaccaagaacctcgag g9065.t1_mm99049_sla2 caggaggctatcgagcggcgagagaagctggagcgagagatcaagaccaaaaatctcgag g814.t1_mn12262_sla2 caggaggctatcgagcggcgagaaaagctagagagagagatcaagactaaaaacctcgag g10271.t1_mn11037_sla2 caagaggctatcgagcggcgggagaagctggagagagagattaagaccaagaatctcgag ** ************** ** ** ***** *** ******* ***** ** ** ******


g10807_mm10095_SLA2 cttgctaatatgatccgcgagagagatcgcgctctgcacgaccgcgatcgcctcacaggc g6682_mn10106_SLA2 ctggctaacatgatccgcgagagagaccgcgcattgcacgaccgcgaccgcctcacaggc g6892_mb07020_SLA2 ctggctgacatgatccgcgagagagaccgggcactgcacgaccgtgaccgtctcaccggc g9065.t1_mm99049_sla2 cttgctaatatgatccgcgagagagatcgcgctctgcacgaccgcgatcgcctcacaggc g814.t1_mn12262_sla2 ctggctaacatgatccgcgagagagaccgcgcattgcacgaccgcgaccgcctcacaggc g10271.t1_mn11037_sla2 ctggctaacatgatacgcgagagggaccgggcgctgcacgaccgcgatcgcctgaccggc ** *** * ***** ******** ** ** ** ********** ** ** ** ** *** g10807_mm10095_SLA2 ggaaacaaggaagagctcgaaaagctcaagagagagctacgcatggcacttgaccgtgcc g6682_mn10106_SLA2 ggaaacaaggaagagcttgagaagctcaagagagagttgcgcatggcacttgatcgcgct g6892_mb07020_SLA2 agtaacaaggatgagcttgagaagctcaagagagagttgcgcatggcacttgaccgtgcc g9065.t1_mm99049_sla2 ggaaacaaggaagagctcgaaaagctcaagagagagctacgcatggcacttgaccgtgcc g814.t1_mn12262_sla2 ggaaacaaggaagagcttgagaagctcaagagagagttgcgcatggcacttgatcgcgct g10271.t1_mn11037_sla2 ggcaacaaagaggaacttgagaagctcaagagagagctgcgcatggcacttgatcgtgct * ***** ** ** ** ** *************** * ************** ** ** g10807_mm10095_SLA2 gacaacctcgagagagctaaaggaaacgagctctcgtccatgctgtccaagtataacaga g6682_mn10106_SLA2 gacaacctcgagagagccaaaggaaacgagctctcgtcaatgctgtccaagtacaacaga g6892_mb07020_SLA2 gacaacctcgagagggccaaaggaaatgagctttcatccatgctgtccaagtacaacagg g9065.t1_mm99049_sla2 gacaacctcgagagagctaaaggaaacgagctctcgtccatgctgtccaagtataacaga g814.t1_mn12262_sla2 gacaacctcgagagagccaaaggaaacgagctctcgtcaatgctgtccaagtacaacaga g10271.t1_mn11037_sla2 gacaacctcgagagagcaaaaggaaatgagctttcatcaatgctgtctaagtacaacaga ************** ** ******** ***** ** ** ******** ***** ***** g10807_mm10095_SLA2 gagatggctgatctggaggaggccctccgaaccaagtcgcgggcgctcgaggaagcccaa g6682_mn10106_SLA2 gagatggctgacctggaggaggcccttcgaaccaagtctcgggcgctcgaggaagcccaa g6892_mb07020_SLA2 gagatggctgacttggaggaggctctacgaaccaagtcgcgggcgcttgaagaggctcag g9065.t1_mm99049_sla2 gagatggctgatctggaggaggccctccgaaccaagtcgcgggcgctcgaggaagcccaa g814.t1_mn12262_sla2 
gagatggctgacctggaggaggcccttcgaaccaagtctcgggcgctcgaggaagcccaa g10271.t1_mn11037_sla2 gagatggctgacttggaggaggccctccgcaccaagtcgcgggcacttgaggaagcccaa *********** ********** ** ** ******** ***** ** ** ** ** ** g10807_mm10095_SLA2 agcaacatgcggagcggcagctcggatctcgagcaacttctcagcgataaggaagaggag g6682_mn10106_SLA2 aacaacatgcggagcggcagctctgatctcgagcagcttcttagcgataaggaagaagag g6892_mb07020_SLA2 aacaacatgcgcagcggcagctctgatcttgagcggctgcttcaggacaaggaggaagag g9065.t1_mm99049_sla2 agcaacatgcggagcggcagctcggatctcgagcaacttctcagcgataaggaagaggag g814.t1_mn12262_sla2 aacaacatgcggagcggcagctctgatctcgagcagcttcttagcgataaggaagaagag g10271.t1_mn11037_sla2 aacaacatgcggagcggcagctcggatcttgagcaacttctcagtgacaaggaagaagag * ********* *********** ***** **** ** ** ** ***** ** *** g10807_mm10095_SLA2 cttgaggtctacaaggccagtctggatcaggcactcgtcgagctcaccacgctgagagag g6682_mn10106_SLA2 ctcgaggtctacaaggccagtttggaccaggcactcgttgagctcaccacgctgagagag g6892_mb07020_SLA2 ctagaggtctacaaggccagcttggaccaggcgctcgtcgagctcactacattgagagag g9065.t1_mm99049_sla2 cttgaggtctacaaggccagtctggatcaggcactcgtcgagctcaccacgctgagagag g814.t1_mn12262_sla2 ctcgaggtctacaaggccagtttggaccaggcactcgttgagctcaccacgctgagagag g10271.t1_mn11037_sla2 ctcgaggtctacaaggccagtctggaccaagcactcgtcgagctcaccacgctgagagag ** ***************** **** ** ** ***** ******** ** ******** g10807_mm10095_SLA2 agccaaggtgctactgatgaggcccttgactctgctctttatggcgccaacctcgataga g6682_mn10106_SLA2 agccaaggcgctactgacgaagcccttgactctgctctttacggcgcgaacctcgacaga g6892_mb07020_SLA2 agccaaggtgccactgacgaggccctcgactctgctctctatggtgcgaacctcgacaga g9065.t1_mm99049_sla2 agccaaggtgctactgatgaggcccttgactctgctctttatggcgccaacctcgataga g814.t1_mn12262_sla2 agccaaggcgctactgacgaagcccttgactctgctctttacggcgcgaacctcgacaga g10271.t1_mn11037_sla2 agccagggtgctacggacgaggccctagactccgccctttatggcgcgaacctcgacaga ***** ** ** ** ** ** ***** ***** ** ** ** ** ** ******** *** g10807_mm10095_SLA2 atcaaccacatgatcgattcggtgttggaggctggtgtggctcgtgtcgacgatgctctt g6682_mn10106_SLA2 
atcaaccacatgatcgattccgtgttggaggctggtgtagctcgtgtcgacgacgctctt g6892_mb07020_SLA2 attaaccacatgatcgactctgtgctggaagctggtgtggcgcgtgtcgacgacgctctt g9065.t1_mm99049_sla2 atcaaccacatgatcgattcggtgttggaggctggtgtggctcgtgtcgacgatgctctt g814.t1_mn12262_sla2 atcaaccacatgatcgattccgtgttggaggctggtgtagctcgtgtcgacgacgctctt g10271.t1_mn11037_sla2 atcaaccacatgatcgattcggtgctggaggctggtgtggcacgtgtcgacgacgctctt ** ************** ** *** **** ******** ** *********** ****** g10807_mm10095_SLA2 tacgaactggactcaagcatgcaggctggtaaccaaaacgcctcgcccacctacgtgctg g6682_mn10106_SLA2 tacgagctggattcgagcatgcaagctggtaaccaaaacgcctcgcccacctatgtgctg g6892_mb07020_SLA2 tacgaactggactcgagcatgcaggccggtaaccagaacgcttcgccggcctatgtgctg g9065.t1_mm99049_sla2 tacgaactggactcaagcatgcaggctggtaaccaaaacgcctcgcccacctacgtgctg g814.t1_mn12262_sla2 tacgagctggattcgagcatgcaagctggtaaccaaaacgcctcgcccacctatgtgctg

g10271.t1_mn11037_sla2 tacgaactggactcgagcatgcaggctggcaatcaaaacgcctcgcccacatacgtgctc ***** ***** ** ******** ** ** ** ** ***** ***** * ** ***** g10807_mm10095_SLA2 tcccaaatcgagaaggcctcagccactgctacagaatttgcgaccgcatttaacgacttc g6682_mn10106_SLA2 tcccaaatcgaaaaggcctcggccactgccacagaatttgcaaccgcattcaacgacttc g6892_mb07020_SLA2 tcccagatcgagaaggcgtcggccaatgccacagaattcgccaccgcctttaacgacttc g9065.t1_mm99049_sla2 tcccaaatcgagaaggcctcagccactgctacagaatttgcgaccgcatttaacgacttc g814.t1_mn12262_sla2 tcccaaatcgaaaaggcctcggccactgccacagaatttgcaaccgcattcaacgacttc g10271.t1_mn11037_sla2 tcccaaatcgagaaggcgtcggccactgctacagaatttgcaaccgcgttcaacgacttc ***** ***** ***** ** **** *** ******** ** ***** ** ********* g10807_mm10095_SLA2 ctggccgatattcccaacgccgaccacgctaacgtcatcaaagccatcaacgtgttctct g6682_mn10106_SLA2 ctggccgacattcccaatgccgaccacgctaatgtcatcaaggctatcaacgtgttctcc g6892_mb07020_SLA2 ctggctgatatccccaatgccgatcacgccaatgttatcaagacgatcaatgtgttctct g9065.t1_mm99049_sla2 ctggccgatattcccaacgccgaccacgctaacgtcatcaaagccatcaacgtgttctct g814.t1_mn12262_sla2 ctggccgacattcccaatgccgaccacgctaatgtcatcaaggctatcaacgtgttctcc g10271.t1_mn11037_sla2 ctggccgacattcccaacgccgaccattctaacgtcatcaagaccatcaacgtgttctct ***** ** ** ***** ***** ** * ** ** ***** * ***** ******** g10807_mm10095_SLA2 ggtgctgttgcagatgtctgcagcaacaccaagggtctgacacgccttgcaactgatgac g6682_mn10106_SLA2 ggcgctattgcagatgtttgcagcaataccaagggtctgacacgcctcgcaactgatgac g6892_mb07020_SLA2 ggcgccattgctgatgtctgcagcaacaccaagggtctgacacgtctcgctacggacgac g9065.t1_mm99049_sla2 ggtgctgttgcagatgtctgcagcaacaccaagggtctgacacgccttgcaactgatgac g814.t1_mn12262_sla2 ggcgctattgcagatgtttgcagcaataccaagggtctgacacgcctcgcaactgatgac g10271.t1_mn11037_sla2 ggtgctattgcagatgtctgtagcaacaccaagggtctgacacgccttgcgactgatgac ** ** **** ***** ** ***** ***************** ** ** ** ** *** g10807_mm10095_SLA2 aagaagaccgaccagctcatgaacggtgcccgagtggcagcgcagtccgccattcgattc g6682_mn10106_SLA2 aagaagactgaccagctcatgaacggtgcccgagtagcggcgcagtccgccattcggttc g6892_mb07020_SLA2 
aagaagaccgaccagctcatgaacggcgctcgagtagcggcgcaatcgaccatccgtttc g9065.t1_mm99049_sla2 aagaagaccgaccagctcatgaacggtgcccgagtggcagcgcagtccgccattcgattc g814.t1_mn12262_sla2 aagaagactgaccagctcatgaacggtgcccgagtagcggcgcagtccgccattcggttc g10271.t1_mn11037_sla2 aagaagactgaccagctcatgaatggtgcccgagtagcggcgcaatctgctattcgattc ******** ************** ** ** ***** ** ***** ** * ** ** *** g10807_mm10095_SLA2 ttcagggggctcctgagcttccagctggtagacagagaagccgaggagaagcaagacatc g6682_mn10106_SLA2 ttcaggggactcctgagcttccagctagtagacagagaagccgaagagaaacaggacatt g6892_mb07020_SLA2 ttcaggggtctcctgagcttccaattggtagatagagaggccgaggagaagcaggacgtt g9065.t1_mm99049_sla2 ttcagggggctcctgagcttccagctggtagacagagaagccgaggagaagcaagacatc g814.t1_mn12262_sla2 ttcaggggactcctgagcttccagctagtagacagagaagccgaagagaaacaggacatt g10271.t1_mn11037_sla2 ttcaggggactcctgagctttcagctggtagacagggacgctgaagagaagcaggacgtg ******** *********** ** * ***** ** ** ** ** ***** ** *** * g10807_mm10095_SLA2 gtgattaatagcaacattgatgtccagatgaacctgcaaactctgaacaagcttgtagag g6682_mn10106_SLA2 gtaatcaatagcaacattgatgtccagatgaacctgcaaactctgaacaagctcgtagag g6892_mb07020_SLA2 gtgatcaacagcaacatcgacgttcaaatgaatctgcagactctgaacaagctcgtggag g9065.t1_mm99049_sla2 gtgattaatagcaacattgatgtccagatgaacctgcaaactctgaacaagcttgtagag g814.t1_mn12262_sla2 gtaatcaatagcaacattgatgtccagatgaacctgcaaactctgaacaagctcgtagag g10271.t1_mn11037_sla2 gtgatcaacagcaacattgatgttcagatgaacctgcaaactctgaacaagctcattgag ** ** ** ******** ** ** ** ***** ***** ************** * *** g10807_mm10095_SLA2 acattcgcgcccggctttggaaagcttgctacaaacaagggcgacatcggcgacctggtt g6682_mn10106_SLA2 acattcgcgcccggcttcggaaaacttgctacaaacaagggcgacattggcgaccttgtt g6892_mb07020_SLA2 acattcgcgcctggtttcggaaagcttgccaccaataaaggcgatattggcgatctggtc g9065.t1_mm99049_sla2 acattcgcgcccggctttggaaagcttgctacaaacaagggcgacatcggcgacctggtt g814.t1_mn12262_sla2 acattcgcgcccggcttcggaaaacttgctacaaacaagggcgacattggcgaccttgtt g10271.t1_mn11037_sla2 acattcgcgcccggcttcggaaagcttgctacgaacaagggtgatatcggcgacctggtt *********** ** ** ***** ***** ** ** ** 
** ** ** ***** ** ** g10807_mm10095_SLA2 gattccgagctgagtaaagcggcggatgccattgctgctgcggctgctcggcttgccaag g6682_mn10106_SLA2 gattccgagctgagtaaagcggcggatgccattgccgctgctgctgctcggctcgccaag g6892_mb07020_SLA2 gattccgagctgagtaaggcggcagatgccatcgccgccgctgctgcccggcttgccaag g9065.t1_mm99049_sla2 gattccgagctgagtaaagcggcggatgccattgctgctgcggctgctcggcttgccaag g814.t1_mn12262_sla2 gattccgagctgagtaaagcggcggatgccattgccgctgctgctgctcggctcgccaag g10271.t1_mn11037_sla2 gattccgagctaagcaaggcggcggatgctatcgccgccgctgctgctcggctcgccaag *********** ** ** ***** ***** ** ** ** ** ***** ***** ****** g10807_mm10095_SLA2 ctcagaaacaagccacgcgacaagtactcgacctacgagctcaaggttcacgactcgatc g6682_mn10106_SLA2 ctcagaaacaagccgcgcgacaaatactcgacctacgagctcaaggttcacgactcaatc

g6892_mb07020_SLA2 ctcaagaacaagccgcgcgacaagtattcaacctacgagctcaaggttcatgactcgatt g9065.t1_mm99049_sla2 ctcagaaacaagccacgcgacaagtactcgacctacgagctcaaggttcacgactcgatc g814.t1_mn12262_sla2 ctcagaaacaagccgcgcgacaaatactcgacctacgagctcaaggttcacgactcaatc g10271.t1_mn11037_sla2 ctcagaaacaagccgcgcgacaaatactcaacctacgagctcaaggtccacgactcgatc **** ******** ******** ** ** ***************** ** ***** ** g10807_mm10095_SLA2 ttggatgctgccctggccatcacgaacgctatcgccaagctcattaaagcagctacagtc g6682_mn10106_SLA2 ttggatgctgccctggccatcacgaacgctatcgccaggctcatcaaggcagccacggtc g6892_mb07020_SLA2 ctggatgcggccatggccatcacgaacgctatcgccaggctcatcaaggctgccacagtc g9065.t1_mm99049_sla2 ttggatgctgccctggccatcacgaacgctatcgccaagctcattaaagcagctacagtc g814.t1_mn12262_sla2 ttggatgctgccctggccatcacgaacgctatcgccaggctcatcaaggcagccacggtc g10271.t1_mn11037_sla2 ttggatgctgctctagccatcacaaacgccatcgccagactaattaaagccgctacagtg ******* ** * ******** ***** ******* ** ** ** ** ** ** ** g10807_mm10095_SLA2 acccagcaagaaattgtgcaggctggcagaggatcatcctcgaggactgcgttctacaag g6682_mn10106_SLA2 acgcagcaggaaatcgtgcaggctggcagaggatcatcctcgaggactgcgttctacaag g6892_mb07020_SLA2 actcaacaggagattgtgcaggctggcaggggatcatcctccaggactgcgttctacaag g9065.t1_mm99049_sla2 acccagcaagaaattgtgcaggctggcagaggatcatcctcgaggactgcgttctacaag g814.t1_mn12262_sla2 acgcagcaggaaatcgtgcaggctggcagaggatcatcctcgaggactgcgttctacaag g10271.t1_mn11037_sla2 acccagcaggaaattgtgcaggctggcagaggatcgtcctcgaggactgcgttttacaag ** ** ** ** ** ************** ***** ***** *********** ****** g10807_mm10095_SLA2 aagaataatcgttggaccgagggtctcatctcggcggccaaggccgtggcctcttcgacc g6682_mn10106_SLA2 aagaacaaccgttggaccgagggtctcatctcggcggccaaggccgtggcttcttcgacc g6892_mb07020_SLA2 aagaacaaccgctggaccgaaggtcttatttcggctgcaaaggctgtggccacgtcgacc g9065.t1_mm99049_sla2 aagaataatcgttggaccgagggtctcatctcggcggccaaggccgtggcctcttcgacc g814.t1_mn12262_sla2 aagaacaaccgttggaccgagggtctcatctcggcggccaaggccgtggcttcttcgacc g10271.t1_mn11037_sla2 aagaacaatcgttggaccgagggtctcatctcggcagccaaggccgtagcctcttcgacc ***** ** ** ******** 
***** ** ***** ** ***** ** ** * ****** g10807_mm10095_SLA2 aatactctcattgagaccgccgacggtgtgctttctaaccgcaacagccccgagcagctg g6682_mn10106_SLA2 aacactcttattgagaccgccgacggtgtgctttctaaccgcaacagccccgaacagctg g6892_mb07020_SLA2 aacacccttatcgagactgctgacggtgtcctgtcaaaccgcaacagtcccgagcagcta g9065.t1_mm99049_sla2 aatactctcattgagaccgccgacggtgtgctttctaaccgcaacagccccgagcagctg g814.t1_mn12262_sla2 aacactcttattgagaccgccgacggtgtgctttctaaccgcaacagccccgaacagctg g10271.t1_mn11037_sla2 aacacattaatcgagaccgccgatggtgtgctgtctaaccgtaacagccccgagcagttg ** ** * ** ***** ** ** ***** ** ** ***** ***** ***** *** * g10807_mm10095_SLA2 atcgtggcatccaacaacgttgctgcctccacagcccagcttgtcgctgccagccgtgtc g6682_mn10106_SLA2 atcgtggcgtccaacaacgttgctgcctccacagcccagcttgtcgctgccagccgcgtc g6892_mb07020_SLA2 attgttgcatccaatgacgtcgctgcctctacagcccagcttgtcgccgccagtcgtgtc g9065.t1_mm99049_sla2 atcgtggcatccaacaacgttgctgcctccacagcccagcttgtcgctgccagccgtgtc g814.t1_mn12262_sla2 atcgtggcgtccaacaacgttgctgcctccacagcccagcttgtcgctgccagccgcgtc g10271.t1_mn11037_sla2 attgtcgcgtccaacaacgttgctgcctccacagcgcaacttgtcgctgccagccgtgtc ** ** ** ***** **** ******** ***** ** ******** ***** ** *** g10807_mm10095_SLA2 aaggctggcttcatgagccagaaccaggatgatctggagcaggctagcaaggctgttggt g6682_mn10106_SLA2 aaggctggcttcatgagccagaaccaggacgacctggagcaggccagcaaggctgttggt g6892_mb07020_SLA2 aaggccggcttcatgagcaagaaccaggacgacctggagcaggccagcaaggctgtcggt g9065.t1_mm99049_sla2 aaggctggcttcatgagccagaaccaggatgatctggagcaggctagcaaggctgttggt g814.t1_mn12262_sla2 aaggctggcttcatgagccagaaccaggacgacctggagcaggccagcaaggctgttggt g10271.t1_mn11037_sla2 aaggctggcttcatgagtcagaaccaggacgacctggagcaggccagcaaggccgttggc ***** *********** ********** ** *********** ******** ** ** g10807_mm10095_SLA2 gccgcttgccgttccctggtccggcaggtacaggcgcttatcaaggagcgctcgaacgag g6682_mn10106_SLA2 gccgcttgccgttccctggtccggcaggtgcaagcgcttatcaaggagcgctcgaacgag g6892_mb07020_SLA2 gccgcctgccgtgccttggtccgacaagtacaggcgctcatcaaggagcgctcgaacgag g9065.t1_mm99049_sla2 
gccgcttgccgttccctggtccggcaggtacaggcgcttatcaaggagcgctcgaacgag g814.t1_mn12262_sla2 gccgcttgccgttccctggtccggcaggtgcaagcgcttatcaaggagcgctcgaacgag g10271.t1_mn11037_sla2 gctgcttgccgctccctggtccggcaggtgcaggcgcttatcaaggagcgctcgaacgag ** ** ***** ** ******* ** ** ** ***** ********************* g10807_mm10095_SLA2 gaggactcagtggactatggagctcttggcgcgcacgagttcaaggtgcgagaaatggag g6682_mn10106_SLA2 gaggactcggtggactatggagctcttggcgcgcacgagtttaaggtgcgggaaatggag g6892_mb07020_SLA2 gaggactcggtggactatggtgcgcttggtgcgcacgagttcaaggtgcgagagatggag g9065.t1_mm99049_sla2 gaggactcagtggactatggagctcttggcgcgcacgagttcaaggtgcgagaaatggag g814.t1_mn12262_sla2 gaggactcggtggactatggagctcttggcgcgcacgagtttaaggtgcgggaaatggag g10271.t1_mn11037_sla2 gaggactcggtagactatggagctcttggtgcgcacgagttcaaggttcgggaaatggag ******** ** ******** ** ***** *********** ***** ** ** ******

g10807_mm10095_SLA2 caac------aggtcgaaatcctcaagcttgagaactcgttgtccgcggcaaga g6682_mn10106_SLA2 caac------aggtcgaaatcctcaagcttgagaactcgttgtccgcggcaaga g6892_mb07020_SLA2 caac------aggtcgaaatcctcaagcttgagaactcgttgtctgcggcaaga g9065.t1_mm99049_sla2 caac------aggtcgaaatcctcaagcttgagaactcgttgtccgcggcaaga g814.t1_mn12262_sla2 caac------aggtcgaaatcctcaagcttgagaactcgttgtccgcggcaaga g10271.t1_mn11037_sla2 caacagtttctgtcccaggttgaaatcctcaagctcgagaactcgttatccgcggcaaga **** **** ************** *********** ** ********* g10807_mm10095_SLA2 cacaggctgggcgagatgcgcaagatctcataccaggaggag g6682_mn10106_SLA2 cacaggttgggtgagatgcgtaagatctcataccaggaggag g6892_mb07020_SLA2 cgcaggctgggcgagatgcgcaagatctcgtaccaggaggag g9065.t1_mm99049_sla2 cacaggctgggcgagatgcgcaagatctcataccaggaggag g814.t1_mn12262_sla2 cacaggttgggtgagatgcgtaagatctcataccaggaggag g10271.t1_mn11037_sla2 cacaggttgggcgagatgcgcaagatctcataccaggaggag * **** **** ******** ******** ************


Appendix 5.4 Alignment of MAT1-2-1 coding sequences

g10806_mm10095_MAT2 atgcag------cccaactaccacggcggtgcctttggtgccaatggcgacggttcggga g6681_mn10106_MAT2 atgcag------cccaataaccaaggcggtgccttcggtgccaatggcggcggttcagga g6891_mb07020_MAT2 atgcagccgagcttcccaggcggaggcggagccttcagtgacaatggcggctgctctgga g9066.t1_mm99049_mat2 atgcag------cccaactaccacggcggtgcctttggtgccaatggcgacggttcggga g815.t1_mn12262_mat2 atgcag------cccaataaccaaggcggtgccttcggtgccaatggcggcggttcagga g10270.t1_mn11037_mat2 atgcag------ccaagcactccaggtggtgccttcggtaccaatggcagcggctctgga ****** ** ** ***** ** ******* * * ** *** g10806_mm10095_MAT2 ggatacggccctccagttcaagcgtcaggtctg------caa g6681_mn10106_MAT2 ggatacggccctccggccccagctccaggtctg------cca g6891_mb07020_MAT2 gggttcagcccgccgaaccagattcctagtatgccacaacctggagcagccccatcaagt g9066.t1_mm99049_mat2 ggatacggccctccagttcaagcgtcaggtctg------caa g815.t1_mn12262_mat2 ggatacggccctccggccccagctccaggtctg------cca g10270.t1_mn11037_mat2 ggatacggcctcccggttccagcaccaggtctg------cca ** * * *** ** * * ** ** g10806_mm10095_MAT2 cagccgactggt---gtcgtccgcaacccgactctcgtccgcctgctccatgagtggcag g6681_mn10106_MAT2 cagccggctggt---gtcgtccgcaatccggccctggtccgcctgctccacgagtggtat g6891_mb07020_MAT2 catcagccattcgaactcgccctcaatcagaatctcgtgcgattgcttcgcgagtggcag g9066.t1_mm99049_mat2 cagccgactggt---gtcgtccgcaacccgactctcgtccgcctgctccatgagtggcag g815.t1_mn12262_mat2 cagccggctggt---gtcgtccgcaatccggccctggtccgcctgctccacgagtggtat g10270.t1_mn11037_mat2 cagccctctggtattgccgaccgcaacccggctctcttccgcctgctccacgagtggcag ** * * ** ** *** * * ** * ** **** * ****** * g10806_mm10095_MAT2 tactgccagcagcttcagcgtcctgcagtcgacattgtctttatccccagctcaatcttt g6681_mn10106_MAT2 tactgccaacagcttcagcgtcccgcagtcgatgttgtctgcatccctagctcaatcttt g6891_mb07020_MAT2 tactgccagcaacttccgcgccctgtggttgacattgtctgtattcccacttcgatctac g9066.t1_mm99049_mat2 tactgccagcagcttcagcgtcctgcagtcgacattgtctttatccccagctcaatcttt g815.t1_mn12262_mat2 tactgccaacagcttcagcgtcccgcagtcgatgttgtctgcatccctagctcaatcttt g10270.t1_mn11037_mat2 
tactgtcagcagcttcaacgtcccacattcgacgttgtctgcatccccagctcaatcttt ***** ** ** **** ** ** * ** ****** ** ** * ** **** g10806_mm10095_MAT2 gaccgctggtctgctcaggcaaggaacttgattaggcaactacacggtgcttccacccgg g6681_mn10106_MAT2 gaccgctggtctgctcaggcaaagaacttggtccgacaactacacggtgcttcaacccgg g6891_mb07020_MAT2 gacagctggactgctcaggcaaagatcttggttcggcaactgcatggtgcttccactcgc g9066.t1_mm99049_mat2 gaccgctggtctgctcaggcaaggaacttgattaggcaactacacggtgcttccacccgg g815.t1_mn12262_mat2 gaccgctggtctgctcaggcaaagaacttggtccgacaactacacggtgcttcaacccgg g10270.t1_mn11037_mat2 gaccgctggtcgccccaggcaaaaaatatggttcggcaaatgcacggtgccaccacccgc *** ***** * * ******* * ** * * *** * ** ***** * ** ** g10806_mm10095_MAT2 aaggatgtggtcttctgttttgactcctacagctcgggtcatgtgtatctgggcgctttg g6681_mn10106_MAT2 aaggatgcggtcttctgttttgactcctacagctcgggccatgtgtacctgggggctttg g6891_mb07020_MAT2 aaggatgtcgtctactgcttcgattcctacatcgcgggtcgtatgtacatcggagcgctc g9066.t1_mm99049_mat2 aaggatgtggtcttctgttttgactcctacagctcgggtcatgtgtatctgggcgctttg g815.t1_mn12262_mat2 aaggatgcggtcttctgttttgactcctacagctcgggccatgtgtacctgggggctttg g10270.t1_mn11037_mat2 aaggatgtggtttactgttttgactcttatgattcgggccgtatatacctggggtcccta ******* ** * *** ** ** ** ** **** * * * ** * ** * * g10806_mm10095_MAT2 atggatttcattgtagctggctactggatccatcaaatggccggaagcagcatgccagca g6681_mn10106_MAT2 atggatttcattgtagctggctactggatccatcaaatggccggaagcagcatgccagcc g6891_mb07020_MAT2 atggacttcatctcggccggatactggatccatcaagagccaggaagcagcatgccagca g9066.t1_mm99049_mat2 atggatttcattgtagctggctactggatccatcaaatggccggaagcagcatgccagca g815.t1_mn12262_mat2 atggatttcattgtagctggctactggatccatcaaatggccggaagcagcatgccagcc g10270.t1_mn11037_mat2 atggatttcatctcggccggctactggatccatcaaatgcctggaagcagaatgccagca ***** ***** ** ** *************** * * ******** ******** g10806_mm10095_MAT2 gtgggcttgaccgctcagcagtcgggcttgcatgttccctctcaatcgtcctctgctctt g6681_mn10106_MAT2 gtgggcttgattcctcagcagttgggcgtgcctgttcactctcaattgcaatctgttctt g6891_mb07020_MAT2 gtaggcttggtgcatcagcaatcacgcgtgcctgtgccatcgcaacagtcctctgtcggt 
g9066.t1_mm99049_mat2 gtgggcttgaccgctcagcagtcgggcttgcatgttccctctcaatcgtcctctgctctt g815.t1_mn12262_mat2 gtgggcttgattcctcagcagttgggcgtgcctgttcactctcaattgcaatctgttctt g10270.t1_mn11037_mat2 gtgggtttgatctctcaggagccgggtttggctgcgccttctcaaccgtcttctactttt ** ** *** **** * * ** ** * ** *** * *** * g10806_mm10095_MAT2 ggacagtcggtccaggcttctgcttccacagtcttcacagctccttctgctctccaacct g6681_mn10106_MAT2 ggacagtcggtccaggcttctgcttctacagtcttccctgctcccgctgttctccaacct

g6891_mb07020_MAT2 acccagtcgacgcgggatcctatggaaccactccctg---gtacagctgccatgtcgcct g9066.t1_mm99049_mat2 ggacagtcggtccaggcttctgcttccacagtcttcacagctccttctgctctccaacct g815.t1_mn12262_mat2 ggacagtcggtccaggcttctgcttctacagtcttccctgctcccgctgttctccaacct g10270.t1_mn11037_mat2 gatcgtttggttcagtttcctgtctccacagttcctg---cttccacggttttgcagcct * * * * * * ** ** * * * * * * *** g10806_mm10095_MAT2 tcgaatgcggcacaaaatccacctccacctcaagctgaatctggtcatcagaccaactcg g6681_mn10106_MAT2 tcgagtgctgcacagaatccacctccaactcaagctggatccattcatctgaccaactcg g6891_mb07020_MAT2 tccagttccgctcaggtccgtcctcagcctgcagccgaagccagccaccagaatgcctcg g9066.t1_mm99049_mat2 tcgaatgcggcacaaaatccacctccacctcaagctgaatctggtcatcagaccaactcg g815.t1_mn12262_mat2 tcgagtgctgcacagaatccacctccaactcaagctggatccattcatctgaccaactcg g10270.t1_mn11037_mat2 tccaatgctgcacagggaccatatcagccacagtttgattccagtcatcaaatcatctct ** * * * ** ** * ** * * * ** * * *** g10806_mm10095_MAT2 actcttgatgctgatctacagctccagagcacaactcgtc------ccagcacccct g6681_mn10106_MAT2 actcaatcggccgatccccagcttcagagcacgactcgtcctagcagtcccagcacccct g6891_mb07020_MAT2 acctcggcaggcagttcttggcctcagaacacgactcgtcc------c g9066.t1_mm99049_mat2 actcttgatgctgatctacagctccagagcacaactcgtc------ccagcacccct g815.t1_mn12262_mat2 actcaatcggccgatccccagcttcagagcacgactcgtcctagcagtcccagcacccct g10270.t1_mn11037_mat2 acttcagcagccga------** * g10806_mm10095_MAT2 acggagcaaagtgcttctcgcaagagaccgatcgcagacatctcgggaagccaggaagtt g6681_mn10106_MAT2 actaatcaaagtgcttctcgcaagagaccgatcgtagacatctcaggaagccaggaaaat g6891_mb07020_MAT2 gccgagcaaggcacttctcgcaagagagcaatcatagctctctcggacattcaggactct g9066.t1_mm99049_mat2 acggagcaaagtgcttctcgcaagagaccgatcgcagacatctcgggaagccaggaagtt g815.t1_mn12262_mat2 actaatcaaagtgcttctcgcaagagaccgatcgtagacatctcaggaagccaggaagtt g10270.t1_mn11037_mat2 ------

g10806_mm10095_MAT2 tcagatccggacccaggctttgttggcgatggcag------cgcccagc------ g6681_mn10106_MAT2 gccgatgaggg------tgtatccgcgaccagt g6891_mb07020_MAT2 ac------ g9066.t1_mm99049_mat2 tcagatccggacccaggctttaatgccgacgaggg------tgtatccgcgaccagt g815.t1_mn12262_mat2 tcaggccctaacccaggctctgacgggaacggtagcgctcaaccaagcggccctcctgct g10270.t1_mn11037_mat2 ------

g10806_mm10095_MAT2 ------ g6681_mn10106_MAT2 gagggcgatgtcgagcacataccagtggcggatactggttcggtaggaagctcgaccgaa g6891_mb07020_MAT2 ------ttcagcacaagtatcgactgag g9066.t1_mm99049_mat2 gagggcaacgtcgagcaaacgccagtggcgggtactggttcggtaaaaggctcgatcgaa g815.t1_mn12262_mat2 gagggcgatgtcgagcacataccagtggcggatactggttcggtaggaagctcgaccgaa g10270.t1_mn11037_mat2 ------tactgagccccaaagcacaact

g10806_mm10095_MAT2 ------ g6681_mn10106_MAT2 aggcctgctaagaaggtcaagagggcaaaaaagccaaagaacaacgatggcgttcctcgt g6891_mb07020_MAT2 agacccgacaaga------aggccaagaagaccaaggttcctcgc g9066.t1_mm99049_mat2 aaacctgccaagaagatcaaaagggcaaaaaatccaaagagaaacgatggcgttcctcgc g815.t1_mn12262_mat2 aggcctgctaagaaggtcaagagggcaaaaaagccaaagaacaacgatggcgttcctcgt g10270.t1_mn11037_mat2 cggcctgccaagagggtgaagagggccaagagagccaagaacaacaatggtgttccccgt

g10806_mm10095_MAT2 ------caagcagccctcctacaactgcccaccgtatggagtttgccgagaaggag g6681_mn10106_MAT2 ccttcgaactcttggatcctgtatagaacagctcatcgtatggagtttgcagagaaggag g6891_mb07020_MAT2 cccgcgaatgcctggatactctacagaagagctcaacggccagattttgccaagcaacat g9066.t1_mm99049_mat2 ccttcgaactcttggattctgtacagaactgcccaccgtatggagtttgccgagaaggag g815.t1_mn12262_mat2 ccttcgaactcttggatcctgtatagaacagctcatcgtatggagtttgcagagaaggag g10270.t1_mn11037_mat2 ccttcaaatgcttggatcctgtacagaacagctcgccgtagggattttgccgacgtcgct * * * * * ** ** * ** ** ***** * g10806_mm10095_MAT2 cccgaaatggacaactgcagcctctc------gaaagttattgccaaa g6681_mn10106_MAT2 ccgacattggacaactgcagcctctc------gaaagttattgccaaa g6891_mb07020_MAT2 ccgggcgcaagtgaaggcgagctctc------aaccttcatctctgca g9066.t1_mm99049_mat2 cccgaaatggacaactgcagcctctc------gaaagttattgccaaa g815.t1_mn12262_mat2 ccgacattggacaactgcagcctctctaactcgttcccgttagcgaaagttattgccaaa g10270.t1_mn11037_mat2 cccagtacagacaactgcaacctctc------aaaaattattgcccaa ** * ** ***** * * ** * *

g10806_mm10095_MAT2 gcatggcacaaggaacccgcagatgtcaaggcgtactggaaacagaaagaaaaggaagtc g6681_mn10106_MAT2 gcatggcacaaggagcccgctgatgtcaaggcgtactggaagcagaaagaaaaggaagtc g6891_mb07020_MAT2 gcatggcgggcagagtctactgaggtgcggacatactggaagcagaaagagaaggaacag g9066.t1_mm99049_mat2 gcatggcacaaggaacccgcagatgtcaaggcgtactggaaacagaaagaaaaggaagtc g815.t1_mn12262_mat2 gcatggcacaaggagcccgctgatgtcaaggcgtactggaagcagaaagaaaaggaagtc g10270.t1_mn11037_mat2 gcatggcacaatgagcccgctgatgtcagggcgtattggaagcagaaggagagggaagtc ******* ** * * ** ** * * ** ***** ***** ** * **** g10806_mm10095_MAT2 cgggatgagcacaggaggctacatccagattacaagtacgccccaacagcctcgaagcgt g6681_mn10106_MAT2 cgggatgagcacaggaggctacatccagcttataagtacgccccaacagccccgaagcgt g6891_mb07020_MAT2 cgcaatgagcacaagcagaagcatccggactacaagtatgcaccgacagccccgaagcag g9066.t1_mm99049_mat2 cgggatgagcacaggaggctacatccagattacaagtacgccccaacagcctcgaagcgt g815.t1_mn12262_mat2 cgggatgagcacaggaggctacatccagcttataagtacgccccaacagccccgaagcgt g10270.t1_mn11037_mat2 cgtgacgaacacaagcggcttcacccaggttataaatacgcaccgactgccccgaagcgc ** * ** **** * * ** ** * ** ** ** ** ** ** *** ****** g10806_mm10095_MAT2 gagacgaagaaaccagcgcagaaatctcgccaacccaaggtcgccgctatccccactgcc g6681_mn10106_MAT2 gagacgaagaagctagctcggaaacctcgccaacccgaggccgtcgctaccccaactgtc g6891_mb07020_MAT2 aagggaacgaagccagctcggaagtctcgcctcgccagcgctccctctgcgctaaatact g9066.t1_mm99049_mat2 gagacgaagaaaccagcgcagaaatctcgccaacccaaggtcgccgctatccccactgcc g815.t1_mn12262_mat2 gagacgaagaagctagctcggaaacctcgccaacccgaggccgtcgctaccccaactgtc g10270.t1_mn11037_mat2 gaggcgaagaagccagctcggaaacctcgcgaacccaaggttgcccctacccttgcggtc ** * *** * *** * *** ***** ** * * ** * g10806_mm10095_MAT2 gagttcggaacgtctgacgttgcgcaaagccaagtaaaggagacggttgagagtg---ag g6681_mn10106_MAT2 gagtttagatcgcctgacgttgcgcaaagcccagcaaaggagattgttgaaagcg---ag g6891_mb07020_MAT2 ggactggttacgtctgattctaatcaagatcctgcaaaggagttcgtcaagcggcagcag g9066.t1_mm99049_mat2 gagttcggaacgtctgacgttgcgcaaagccaagtaaaggagacggttgagagtg---ag g815.t1_mn12262_mat2 
gagtttagatcgcctgacgttgcgcaaagcccagcaaaggagattgttgaaagcg---ag g10270.t1_mn11037_mat2 gagctgggaatatctgcggttgcgcaaagcccaacaaagactatcgttcaggacg---ag * * *** * *** * **** ** * ** g10806_mm10095_MAT2 atcattctcaccgagagcactgattctgccgacaagcttgcgtctgctgatgcccttggc g6681_mn10106_MAT2 atcttttctaccgagagcactgattctgcagacaagcttgcgtctactgaagctcttggc g6891_mb07020_MAT2 accgcctccaccgaagccactggctatgccgacaagcacacgattcctgtggcttctggc g9066.t1_mm99049_mat2 atcattctcaccgagagcactgattctgccgacaagcttgcgtctgctgatgcccttggc g815.t1_mn12262_mat2 atcttttctaccgagagcactgattctgcagacaagcttgcgtctactgaagctcttggc g10270.t1_mn11037_mat2 atcatctcctttaagactgctgattctaccgacaagtttgtatctaccgaagccattggc * * * *** * * * ****** * * * ** **** g10806_mm10095_MAT2 gatgtatcaacacctcaagaggtcacaaacattgcgttgtttagcccgagcattgacgtc g6681_mn10106_MAT2 aatgtaccaacacctcaagaggtcacaagcattgcgttgttcagcccgagcattggcgtc g6891_mb07020_MAT2 agcactccgatagtcgcagagatcctggacattgcgccattcagtccgagtattgtcttt g9066.t1_mm99049_mat2 gatgtatcaacacctcaagaggtcacaaacattgcgttgtttagcccgagcattgacgtc g815.t1_mn12262_mat2 aatgtaccaacacctcaagaggtcacaagcattgcgttgttcagcccgagcattggcgtc g10270.t1_mn11037_mat2 aacatgtcaacactgcaagaggtcaccaatattgcttcggtcagcccgatcattgacgtt * * * **** ** ***** * ** **** **** * * g10806_mm10095_MAT2 caactaccatcaccacagccagatgccactctaacgtccg------atcag g6681_mn10106_MAT2 caactcccatcaccgcagccagctgctgctctcaagtccgacaagtcc------gatcag g6891_mb07020_MAT2 caattctcaccacagtcaacagatgcattcgagactgcatgccccgttgcagctgcacaa g9066.t1_mm99049_mat2 caactaccatcaccacagccagatgccactctaacgtccg------atcag g815.t1_mn12262_mat2 caactcccatcaccgcagccagctgctgctctcaagtccgacaagtcc------gatcag g10270.t1_mn11037_mat2 caactcccgttgccgctatcaaatgccgctctccagattgatcacgcc------accgca *** * * * ** *** g10806_mm10095_MAT2 a------ccgccacatcctcacttgcgacaaaggaccagcaatccgatctctacgattat g6681_mn10106_MAT2 g------ccgccgcatcctcacttgcgccggaggatccggaatccagtctcttcgactat g6891_mb07020_MAT2 gagcccacgaaccaagtcctgtcggagccagcgtgccaggagaccctcaacctcgaatac g9066.t1_mm99049_mat2 
a------ccgccacatcctcacttgcgacaaaggaccagcaatccgatctctacgattat g815.t1_mn12262_mat2 g------ccgccgcatcctcacttgcgccggaggatccggaatccagtctcttcgactat g10270.t1_mn11037_mat2 a------a---ctcactcgcaccagcaccagtggaccaggaacccggtgaactcaatcgc * * * * * * * * * ** * * g10806_mm10095_MAT2 atcaccgaatatctgaacgccaatccagccatcgatatcctggccactaacg------g6681_mn10106_MAT2 attaacgagtatctagacgccaatccaaccattgatattctggccgaagactttgtcata g6891_mb07020_MAT2 atcgaagcattcctgaagatcaatccaaccacggattttgaggtcaagcccattctcggc g9066.t1_mm99049_mat2 atcaccgaatatctgaacgccaatccagccatcgatatcctggccactaacg------364

g815.t1_mn12262_mat2 attaacgagtatctagacgccaatccaaccattgatattctggccgaagactttgtcata g10270.t1_mn11037_mat2 attaccgatttcttgaactccaattcaatcaatcatgtcctggcttacaacgctgacgct ** * * * * **** ** ** ** * ** * g10806_mm10095_MAT2 ------acattcctgtcgtgactgccactgccaccgacctttct g6681_mn10106_MAT2 ggcaccaacgacatcactgccatcattgacacaaccactgacactcccgccgacctttct g6891_mb07020_MAT2 gcctctgacagcggaccta---ctgttgaaagcgccactga---ttctgtcgatcctgct g9066.t1_mm99049_mat2 ------acattcctgtcgtgactgccactgccaccgacctttct g815.t1_mn12262_mat2 ggcaccaacgacatcactgccatcattgacacaaccactgacactcccgccgacctttct g10270.t1_mn11037_mat2 gtccctaactgcatctctgc------cgccatcgacacagccactgacgtatcc * * * ** * g10806_mm10095_MAT2 cccacgtctcacgccactcctgccgacactcaagatcagagcttcttggcccgtgaagtg g6681_mn10106_MAT2 ccaacgtctcacgccactcctgccgacactcaagatcagagcttcttagatagtggactg g6891_mb07020_MAT2 tcaatcaacgccaccactcctaccgacgctcgaggccagggcatgctgggcgaatgctgg g9066.t1_mm99049_mat2 cccacgtctcacgccactcctgccgacactcaagatcagagcttcttggcccgtgaagtg g815.t1_mn12262_mat2 ccaacgtctcacgccactcctgccgacactcaagatcagagcttcttagatagtggactg g10270.t1_mn11037_mat2 tccatgtttggcgccattcctgtcaacactcaagaccagagcttattggactttgggaaa * * * *** **** * ** *** ** *** ** * * * g10806_mm10095_MAT2 ttcccagcagtcaacgccaccagtgaattcacagaccgcattgccgccatcctcctcatg g6681_mn10106_MAT2 tttccagaagccaacgccaccggtgaaatcatagacagcattgccaccacctttctcgtg g6891_mb07020_MAT2 ctcaaggacagcattttcgcaaacacaat------cctcgag g9066.t1_mm99049_mat2 ttcccagcagtcaacgccaccagtgaattcacagaccgcattgccgccatcctcctcatg g815.t1_mn12262_mat2 tttccagaagccaacgccaccggtgaaatcatagacagcattgccaccacctttctcgtg g10270.t1_mn11037_mat2 ctcctggaagggaacacccccaatgaacccacagacagccttcccaccaccacctttg-- * * * * * * * g10806_mm10095_MAT2 ggcagcgactccacaagcatatttcccaacagcaacgccaccagcacaattcccgaaaac g6681_mn10106_MAT2 ggcaccgataccacaaacatattccctgacagcaacgccaccatcataattcccgacgat g6891_mb07020_MAT2 agcaacgagtccagtgcactcgtaggcgccaacaggaccaccgtcgtcgcagatagcagc g9066.t1_mm99049_mat2 
ggcagcgactccacaagcatatttcccaacagcaacgccaccagcacaattcccgaaaac g815.t1_mn12262_mat2 ggcaccgataccacaaacatattccctgacagcaacgccaccatcataattcccgacgat g10270.t1_mn11037_mat2 ------tcatgggcagcgacgctacaagcatgttttccgacaac * ** * * ** * g10806_mm10095_MAT2 aacacggttgccatgttgccgcctgtggacctggcatctagcgatccagccggcacgtct g6681_mn10106_MAT2 agcacgaccgacatgttcccgcctgtggacctgccatctagtggcccagccgacacgtct g6891_mb07020_MAT2 aacgcgtgcgtcatgtccccgctcgcggccatcgcgtctagcgatcccgtcaa---gtct g9066.t1_mm99049_mat2 aacacggttgccatgttgccgcctgtggacctggcatctagcgatccagccggcacgtct g815.t1_mn12262_mat2 agcacgaccgacatgttcccgcctgtggacctgccatctagtggcccagccgacacgtct g10270.t1_mn11037_mat2 agcacgacagctatgcccctgcctatcgacctggcatctggcgatgcagtcgacataact * * ** * *** * ** * * * * *** * * * * * ** g10806_mm10095_MAT2 tcggcgacattggacattgatgagttcatcaacacagaaatgtttcaggaccacccctcc g6681_mn10106_MAT2 tcgccgacagtggacgttgatgagttcatcaatacagacatgttccaggatcacctctct g6891_mb07020_MAT2 tcttcggcttttgacaccgatgagttcttcaacatggacgtgttcgcggcccatccgccc g9066.t1_mm99049_mat2 tcggcgacattggacattgatgagttcatcaacacagaaatgtttcaggaccacccctcc g815.t1_mn12262_mat2 tcgccgacagtggacgttgatgagttcatcaatacagacatgttccaggatcacctctct g10270.t1_mn11037_mat2 tc---gggttcggcctttgatgagttcatcaacaccgaaatgtttcaagaccacccctct ** * * * ********* **** * ** **** * ** * * g10806_mm10095_MAT2 accagcttgtctgtggatttctccgccgccaacgaggaatttgattttgacgttgcaaac g6681_mn10106_MAT2 actagcttgtctgtggatttcgcggccgtcgacgaggagtcggagtttgactttgacttt g6891_mb07020_MAT2 accggccttctcggggatcatcataccgagcctgttgattttgatctcatatttgacaac g9066.t1_mm99049_mat2 accagcttgtctgtggatttctccgccgccaacgaggaatttgattttgacgttgcaaac g815.t1_mn12262_mat2 actagcttgtctgtggatttcgcggccgtcgacgaggagtcggagtttgactttgacttt g10270.t1_mn11037_mat2 atcaactttttgatggactcctttgtcaatggcgatgagttcaaccttgactttaaaaac * * * *** * * ** * * * ** g10806_mm10095_MAT2 tac------g6681_mn10106_MAT2 gcaaactac------g6891_mb07020_MAT2 tgcctgtctgggaatttgcaggct g9066.t1_mm99049_mat2 tac------g815.t1_mn12262_mat2 
gcaaactac------g10270.t1_mn11037_mat2 tac------


Appendix 5.5 Alignment of APN2 coding sequences

g10805_mm10095_APN2 ------atgggc g6680_mn10106_APN2 tgccgccacgaccgaccgctgttgcaccacatctacgccacctccctcgtcgacatgggc g6890_mb07020_APN2 ------atgggc g9067.t1_mm99049_apn2 ------atgggc g816.t1_mn12262_apn2 ------atgggc g10269.t1_mn11037_apn2 ------atgggc ****** g10805_mm10095_APN2 attcgcatcacttcgtggaatgtcaacggcattcgcaatccgtttgggtaccagccatgg g6680_mn10106_APN2 attcgcatcacttcgtggaatgtaaacggcattcgcaatccatttgggtaccagccatgg g6890_mb07020_APN2 atccgcatcacatcatggaatgtgaatggcattcgcaacccgttcggataccagccatgg g9067.t1_mm99049_apn2 attcgcatcacttcgtggaatgtcaacggcattcgcaatccgtttgggtaccagccatgg g816.t1_mn12262_apn2 attcgcatcacttcgtggaatgtaaacggcattcgcaatccatttgggtaccagccatgg g10269.t1_mn11037_apn2 attcgcatcacttcgtggaatgtcaacggcattcgcaatccgttcgggtaccagccatgg ** ******** ** ******** ** *********** ** ** ** ************ g10805_mm10095_APN2 cgggagacacggagtttcg------g6680_mn10106_APN2 cgggagacacggagtttcgaggtgagtgtgctctgctttgcctgtcaggacctgtcgaaa g6890_mb07020_APN2 cg------g9067.t1_mm99049_apn2 cgggagacacggagtttcga------g816.t1_mn12262_apn2 cgggagacacggagtttcgaggtgagtgtgctctgctttgcctgtcaggacctgtcgaaa g10269.t1_mn11037_apn2 cgggagacgcggagtttcgagctcaccc------** g10805_mm10095_APN2 ------agtccatgttcgatattctcgaggcagacattgtcgtc g6680_mn10106_APN2 cagttggctcaccctgttgtccagtccatgttcgacattctcgaagcagacattgtcgtc g6890_mb07020_APN2 -----ggagacacggactttcgagtccatgttcgacatcctcgaggcagacattgtcgtc g9067.t1_mm99049_apn2 --gttggctcaccctgccgttcagtccatgttcgatattctcgaggcagacattgtcgtc g816.t1_mn12262_apn2 cagttggctcaccctgttgtccagtccatgttcgacattctcgaagcagacattgtcgtc g10269.t1_mn11037_apn2 ------cgggcaaccagtccatgttcgacatcctcgaggcagacattgtcgtc ************* ** ***** *************** g10805_mm10095_APN2 atgcaagagctcaagatacaacgcaaggatctccgagacgacatggtcctcgtgcccgga g6680_mn10106_APN2 atgcaggagctcaagatccaacgcaaggatcttcgagatgacatggtcctcgtgcccgga g6890_mb07020_APN2 atgcaggaactcaagatccagcgcaaagatcttcgagatgatatggtcctcgttcccggg g9067.t1_mm99049_apn2 atgcaagagctcaagatacaacgcaaggatctccgagacgacatggtcctcgtgcccgga 
g816.t1_mn12262_apn2 atgcaggagctcaagatccaacgcaaggatcttcgagatgacatggtcctcgtgcccgga g10269.t1_mn11037_apn2 atgcaagagctcaagatccaacgcaaggatcttcgggatgacatggtcctcgtgcccgga ***** ** ******** ** ***** ***** ** ** ** *********** ***** g10805_mm10095_APN2 tgggacgtgtatttcagcttgcccaagcataagaaaggctattcgggtgtggctatctac g6680_mn10106_APN2 tgggacgtatacttcagcttgcccaagcacaagaaaggctattcgggtgtagccatctac g6890_mb07020_APN2 tgggacgtctacttcagcttgcccaagcacaaaaaaggctactcgggtgttgccatttat g9067.t1_mm99049_apn2 tgggacgtgtatttcagcttgcccaagcataagaaaggctattcgggtgtggctatctac g816.t1_mn12262_apn2 tgggacgtatacttcagcttgcccaagcacaagaaaggctattcgggtgtagccatctac g10269.t1_mn11037_apn2 tgggacgtgtacttcagcttgcccaagcacaagaaaggctattcgggtgtggctatttac ******** ** ***************** ** ******** ******** ** ** ** g10805_mm10095_APN2 acaagaaactccgtctgctcccccatccgagcagaagaaggcatcaccggctgcctgact g6680_mn10106_APN2 acaagaaactctgtctgctcacccatccgagcagaagaaggcatcaccggctgcctgact g6890_mb07020_APN2 acaaggaactctgtctgcgctccgatacgagcagaagaaggcatcaccggctgcttgact g9067.t1_mm99049_apn2 acaagaaactccgtctgctcccccatccgagcagaagaaggcatcaccggctgcctgact g816.t1_mn12262_apn2 acaagaaactctgtctgctcacccatccgagcagaagaaggcatcaccggctgcctgact g10269.t1_mn11037_apn2 acaagaaactccgtctgctcccccatccgagcagaagaaggcatcaccggctgcctggca ***** ***** ****** * ** ** *************************** ** * g10805_mm10095_APN2 gccccaggtagcaccactgcctaccgagacctgccagaagatcagcagattggtggctat g6680_mn10106_APN2 tccccaggcagcaccactgcctaccgagatctcccagaagaccagcagattggtggctat g6890_mb07020_APN2 cctccaggcagcatgactgcttttcgagaccttccagaggaccagcagattgggggctat g9067.t1_mm99049_apn2 gccccaggtagcaccactgcctaccgagacctgccagaagatcagcagattggtggctat g816.t1_mn12262_apn2 tccccaggcagcaccactgcctaccgagatctcccagaagaccagcagattggtggctat g10269.t1_mn11037_apn2 cctccgggcagcaccattgcctaccgagatcttccagaagaccagcagattggtggctac * ** ** **** * *** * ***** ** ***** ** *********** ***** g10805_mm10095_APN2 cctcagcctgggcagcttcctggagatgtcgatgatgtcgttctcgacagtgagggtcgt 366

g6680_mn10106_APN2 cctcggcccgggcagcttcctggagacgtcgacgatgtcgttctcgacagtgagggtcgt g6890_mb07020_APN2 cctcaagctgggcagctccccggagacattgacgatgtcgcgctcgacagcgaaggtcgt g9067.t1_mm99049_apn2 cctcagcctgggcagcttcctggagatgtcgatgatgtcgttctcgacagtgagggtcgt g816.t1_mn12262_apn2 cctcggcccgggcagcttcctggagacgtcgacgatgtcgttctcgacagtgagggtcgt g10269.t1_mn11037_apn2 cctcagccttggcagcttcctgcagacgtcgaggacgtcgttctcgacagcgaaggtcgt **** * ******* ** * *** * ** ** **** ******** ** ****** g10805_mm10095_APN2 tgcgtcattctcgagttccctgcgtttgtcatcatcggcacctacagcccagcaaacagc g6680_mn10106_APN2 tgcgtcattctcgaattccctgcatttgtcatcattggcacctacagcccagcaaacagc g6890_mb07020_APN2 tgcgtgatcctcgagtttccagcattcgtcatcatcggcacctatagtccggcaaacagc g9067.t1_mm99049_apn2 tgcgtcattctcgagttccctgcgtttgtcatcatcggcacctacagcccagcaaacagc g816.t1_mn12262_apn2 tgcgtcattctcgaattccctgcatttgtcatcattggcacctacagcccagcaaacagc g10269.t1_mn11037_apn2 tgcgtcattctcgagttccctgcttttgtcctaataggcacctacagtccagcaaatagc ***** ** ***** ** ** ** ** *** * ** ******** ** ** ***** *** g10805_mm10095_APN2 gacggcacgcgagacgacttcaaaatcggctacctgggggcactggacgtgcgaatacgc g6680_mn10106_APN2 gacggctcgcgggacgacttcagaatcggctacatgggggcactggacgtgcggatacgc g6890_mb07020_APN2 gatggaacacgagacgatttcaaagtgggttatctcgacgctctcgacgtgcggatacgg g9067.t1_mm99049_apn2 gacggcacgcgagacgacttcaaaatcggctacctgggggcactggacgtgcgaatacgc g816.t1_mn12262_apn2 gacggctcgcgggacgacttcagaatcggctacatgggggcactggacgtgcggatacgc g10269.t1_mn11037_apn2 gacggcacgcgggatgactttaaagtcggctaccaaggggccttggacatgcggatacgc ** ** * ** ** ** ** * * * ** ** * ** * *** **** ***** g10805_mm10095_APN2 aacctcactgcaatggggaagcaggtcgtcctgacaggcgacctcaacgtgatacgagac g6680_mn10106_APN2 aacctcaccgcaatggggaagcaggtcatcctgacaggcgatctcaacgtgatacgagac g6890_mb07020_APN2 aacctcatcgcaatggggaagcaagtcgtcctgacgggcgatcttaacgtggtacgggat g9067.t1_mm99049_apn2 aacctcactgcaatggggaagcaggtcgtcctgacaggcgacctcaacgtgatacgagac g816.t1_mn12262_apn2 aacctcaccgcaatggggaagcaggtcatcctgacaggcgatctcaacgtgatacgagac g10269.t1_mn11037_apn2 
aaccttgccgcgatgggaaagcaggttgtactgactggcgatctcaatgtgatacgagat ***** ** ***** ***** ** * ***** ***** ** ** *** **** ** g10805_mm10095_APN2 gtgattgatactgcaggcctgcacgagagactccgaaaagaaggcatgaccatggacgac g6680_mn10106_APN2 gtgattgacgctgctggcctgcatgacagactcagaaaagaaggcatgaccatagacgac g6890_mb07020_APN2 gtgatcgatacggctggtttgtacgaaagactcaggaaggaaggtatgaccctggacgac g9067.t1_mm99049_apn2 gtgattgatactgcaggcctgcacgagagactccgaaaagaaggcatgaccatggacgac g816.t1_mn12262_apn2 gtgattgacgctgctggcctgcatgacagactcagaaaagaaggcatgaccatagacgac g10269.t1_mn11037_apn2 gtgattgacactgctggcctgcacgagagactcaggaaagaaggcatgaccatggacgac ***** ** * ** ** ** * ** ****** * ** ***** ****** * ****** g10805_mm10095_APN2 tactttaccatgccgtcgcggcgtgttttcacccagctggtcattggggccaaggtcaaa g6680_mn10106_APN2 tactttaccatgccgtcgcagcgcgtcttcacccagctggtcattggggccaaggtcaaa g6890_mb07020_APN2 tacttcaccatgccgtcgcggcgtatcttcacacaactggtcattggagccaaagtcaga g9067.t1_mm99049_apn2 tactttaccatgccgtcgcggcgtgttttcacccagctggtcattggggccaaggtcaaa g816.t1_mn12262_apn2 tactttaccatgccgtcgcagcgcgtcttcacccagctggtcattggggccaaggtcaaa g10269.t1_mn11037_apn2 tacttcaccatgccttcgcggcgcgtcttcacccagctggtcatcggggccaaggtcaga ***** ******** **** *** * ***** ** ******** ** ***** **** * g10805_mm10095_APN2 ggcggccgggacgagggtcgagaaaaaccggtcttgtgggaccttgggcggctctttcat g6680_mn10106_APN2 ggcggccgggacgaaggtcgagaaaagccgatcttgtgggacctcgggcgactcttccat g6890_mb07020_APN2 ggcggccgagacgaaggtcgagagaagccagttttgtgggacttgggtcggcacttccat g9067.t1_mm99049_apn2 ggcggccgggacgagggtcgagaaaaaccggtcttgtgggaccttgggcggctctttcat g816.t1_mn12262_apn2 ggcggccgggacgaaggtcgagaaaagccgatcttgtgggacctcgggcggctcttccat g10269.t1_mn11037_apn2 ggcggccgggacgaaggtcgagaaaagccagtcttgtgggacctggggcggctgttccat ******** ***** ******** ** ** * ********* * ** ** * ** *** g10805_mm10095_APN2 cctgatcgccagggcatgtatacatgttgggacaccaagaagaactcgagaccgggaaac g6680_mn10106_APN2 cctgatcgccagggcatgtacacatgttggaacaccaagacaaactcgagaccgggaaac g6890_mb07020_APN2 
cccgatcggcagggcatgtacacttgttgggacaccaagaagaattcaagaccaggcaac g9067.t1_mm99049_apn2 cctgatcgccagggcatgtatacatgttgggacaccaagaagaactcgagaccgggaaac g816.t1_mn12262_apn2 cctgatcgccagggcatgtacacatgttggaacaccaagacaaactcgagaccgggaaac g10269.t1_mn11037_apn2 cccgaccgccagggcatgtacacatgctgggacaccaagaagaactcgagaccaggaaac ** ** ** *********** ** ** *** ********* ** ** ***** ** *** g10805_mm10095_APN2 tatgggagtcgcattgactatgtcctatgcagcaatggcatgaaagactggttcgcgacc g6680_mn10106_APN2 tacggcagtcgcattgactttgtcctatgcagtaatggcataaaagactggttcgcgacc g6890_mb07020_APN2 tttggcagtcggatcgactatgtcctgtgcagcaacggcatgaaagactggttcgccgac g9067.t1_mm99049_apn2 tatgggagtcgcattgactatgtcctatgcagcaatggcatgaaagactggttcgcgacc g816.t1_mn12262_apn2 tacggcagtcgcattgactttgtcctatgcagtaatggcataaaagactggttcgcgacc g10269.t1_mn11037_apn2 tttggtagccgtattgactatgtcttgtgcaccaacggcatgaaagactggttcacggcc 367

* ** ** ** ** **** **** * **** ** ***** ************ * * g10805_mm10095_APN2 tctgacatccaggaaggactgatgggttcagaccattgcccagtatttgctgtaatgaac g6680_mn10106_APN2 tccaacatccaggaaggactgatgggttcagaccattgcccagtatttgctgtcatgagc g6890_mb07020_APN2 tcgaatatccaggaaggactcatgggctcggaccactgcccagtatttgctatcatgaag g9067.t1_mm99049_apn2 tctgacatccaggaaggactgatgggttcagaccattgcccagtatttgctgtaatgaac g816.t1_mn12262_apn2 tccaacatccaggaaggactgatgggttcagaccattgcccagtatttgctgtcatgagc g10269.t1_mn11037_apn2 tccgacatccaggaaggcttaatgggctcggaccactgtccagtatttgctgtcatgaaa ** * *********** * ***** ** ***** ** ************ * **** g10805_mm10095_APN2 gacattgtctcactaggtggcaaagacactcacctgcgcgacatcatgaaccccgctggt g6680_mn10106_APN2 gacattgtctcactaggtggtaaagacactcacctgcgcgatatcatgaaccctgctggc g6890_mb07020_APN2 gacattgtctcgatagatggcaaagatgtccatctacgagacatcatgaacccagccggc g9067.t1_mm99049_apn2 gacattgtctcactaggtggcaaagacactcacctgcgcgacatcatgaaccccgctggt g816.t1_mn12262_apn2 gacattgtctcactaggtggtaaagacactcacctgcgcgatatcatgaaccctgctggc g10269.t1_mn11037_apn2 gacgtcatctcactagatggcaaagatatccacctgcgagacattatgaacccgattggc *** * **** *** *** ***** ** ** ** ** ** ******** ** g10805_mm10095_APN2 acctttgagaatggtcagagaatacaagagtggtcgcccaagaaccttctcccactatca g6680_mn10106_APN2 acctttgagaatggccagagactacaggaatggtcacccaagaaccttctcccactatca g6890_mb07020_APN2 accttcgagaacggccgcaggctgcaagaatggtcgccgaagaaccttctcccactatcg g9067.t1_mm99049_apn2 acctttgagaatggtcagagaatacaagagtggtcgcccaagaaccttctcccactatca g816.t1_mn12262_apn2 acctttgagaatggccagagactacaggaatggtcacccaagaaccttctcccactatca g10269.t1_mn11037_apn2 acatttgagaatggccagagactacaggagtggtcgcccaagaaccttctcccactatca ** ** ***** ** * ** * ** ** ***** ** ******************** g10805_mm10095_APN2 gcaaagctgatcgcagagttcgaccgccggcagagcataaaagacatgttcttcaaaaag g6680_mn10106_APN2 gcaaagctgatcgcagagttcgaccgccggcagagcatcaaagccatgttcttcaaaaag g6890_mb07020_APN2 gcgaaactgatcgcggaattcgaccgtcgacagagcatcaaagacatgtttttcaaaaag g9067.t1_mm99049_apn2 
gcaaagctgatcgcagagttcgaccgccggcagagcataaaagacatgttcttcaaaaag g816.t1_mn12262_apn2 gcaaagctgatcgcagagttcgaccgccggcagagcatcaaagccatgttcttcaaaaag g10269.t1_mn11037_apn2 gccaaactgattgcagagttcgaccgtcggcagaacatcaaggacatgttcttcaaaaag ** ** ***** ** ** ******** ** **** *** ** * ****** ********* g10805_mm10095_APN2 ccagcgtcgtccacatcaagccaggctgtacctttaacgaatggcttggccggtgctgcc g6680_mn10106_APN2 cctgcgtcgtctacctcaagccagactgtgcctttgacgaatggtccggcgggtgctgac g6890_mb07020_APN2 ccagcgccgtcagcctcaagtcaggctctgcctctgaccaacagctcagcggaggtggcc g9067.t1_mm99049_apn2 ccagcgtcgtccacatcaagccaggctgtacctttaacgaatggcttggccggng---cc g816.t1_mn12262_apn2 cctgcgtcgtctacctcaagccagactgtgcctttgacgaatggtccggcgggtgctgac g10269.t1_mn11037_apn2 ccaacgtcgctgacctcaagccaatttgcgcctctgacaaatagtttggaggtggctgcc ** ** ** * ***** ** * *** * ** ** * * * * * g10805_mm10095_APN2 gtgcccacagccactccgcctacatcacaacaaatcactgataccacgccgtgtcttagc g6680_mn10106_APN2 gcgcccacaaccattccgtccacaccacaacaaatcaccgataccatgccgtgttttacc g6890_mb07020_APN2 atgcccgagcatttgccgcccacaccagtacaagttcccaatcccacgccagcccttgac g9067.t1_mm99049_apn2 gtgcccacagccactccgcctacatcacaacaaatcactgataccacgccgtgtcttagc g816.t1_mn12262_apn2 gcgcccacaaccattccgtccacaccacaacaaatcaccgataccatgccgtgttttacc g10269.t1_mn11037_apn2 atccccacactctcagcgaccgcaccaggacaatttgctggtagcacgccgggtcttgct *** ** * ** ** **** * * * ** *** ** g10805_mm10095_APN2 cacgtggaatcaccagcagctgctccaacttcatcaagtggaggactcaaacgcacacca g6680_mn10106_APN2 cacacggaatcatcagcagctgccccgaggtcatcaaacggagaactcaaacgcacaccc g6890_mb07020_APN2 tatgcagaagcatcctccgcggtcccaagggcagccagtggggccctcaagcgcacgcca g9067.t1_mm99049_apn2 cacgtggaatcaccagcagctgctccaacttcatcaagtggaggactcaaacgcacacca g816.t1_mn12262_apn2 cacacggaatcatcagcagctgccccgaggtcatcaaacggagaactcaaacgcacaccc g10269.t1_mn11037_apn2 cacgtggaatcacccacggcttccccaaggccatcaaatggagggctcaaacgcacgccg * *** ** * * ** ** * ** * * ** * ***** ***** ** g10805_mm10095_APN2 tccggctcggtgactccacagagaccaccgaaaaagagcaaagcagccttgccgaaagaa g6680_mn10106_APN2 
tcaggctcggtgactccacagaggccagccaaaaagaccaaagcagccttgccaagagaa g6890_mb07020_APN2 tctggctcagtgactccccagaaaccagccaagaagaccaaagcagccttgacgaaggaa g9067.t1_mm99049_apn2 tccggctcggtgactccacagagaccaccgaaaaagagcaaagcagccttgccgaaagaa g816.t1_mn12262_apn2 tcaggctcggtgactccacagaggccagccaaaaagaccaaagcagccttgccaagagaa g10269.t1_mn11037_apn2 tctggctcggcggctctacagaagccagccaagaagaccaaagcagcgttaccccaagaa ** ***** * * *** **** *** * ** **** ********* ** * *** g10805_mm10095_APN2 tcatcaagcaagggggggcagaagagcctgatgggattcttcaagcccaaggccacagcg g6680_mn10106_APN2 tcatcaagcaaggggggccagaagagcctgatgggattcttcaagcccaaggctacagcg g6890_mb07020_APN2 acatcaggcaagg---gccagaagagcttaatgggatttttcaagcccaagacaacacca 368

g9067.t1_mm99049_apn2 tcatcaagcaagggggggcagaagagcctgatgggattcttcaagcccaaggccacagcg g816.t1_mn12262_apn2 tcatcaagtaaggggggccagaagagcctgatgggattcttcaagcccaaggctacagcg g10269.t1_mn11037_apn2 acagcaagtaaaggggggcagaagagcctgatgggattcttcaagcccaagataacaaca ** ** * ** * * ********* * ******** ************ *** * g10805_mm10095_APN2 acctcatcaactgccaccgcgcttaccaatatgcactcagaagagacagtggcgagcccg g6680_mn10106_APN2 acctcatcggctactactgcgcccaccaatctgcaatcagaagagacaacggcgagcccg g6890_mb07020_APN2 gcggattcggcttcggctaatttaaccacttctcgagcagaagagacagcaacaagtcca g9067.t1_mm99049_apn2 acctcatcaactgccaccgcgcttaccaatatgcactcagaagagacagtggcgagcccg g816.t1_mn12262_apn2 acctcatcggctgctactgcgcccaccaatctgcaatcagaagagacaacggcgagcccg g10269.t1_mn11037_apn2 acatcagcggctgccaccacgcccacatttgtgcaatcagaaaacacagcggcgagcccg * * ** * * ** * * ***** * *** * ** ** g10805_mm10095_APN2 ccgtcgagctctgctttgtcaacgcttgatgagcactaccaacagcaacaacaagtttca g6680_mn10106_APN2 ccgtcgagctctgctttgtcaaatctcgatgagcaccaccaa---caacaacaagtttcg g6890_mb07020_APN2 ccgccaagctccgccatgtcagaattcgacgagcaaccgcaa------cagaaagtttcg g9067.t1_mm99049_apn2 ccgtcgagctctgctttgtcaacgcttgatgagcactaccaacagcaacaacaagtttca g816.t1_mn12262_apn2 ccgtcgagctctgctttgtcaaatctcgatgagcaccaccaa---caacaacaagtttcg g10269.t1_mn11037_apn2 ccgccgagttctgccctgtcggagcttgacgagcaccaaa---agcagcaacaagtctcg *** * ** ** ** **** * ** ***** ** **** ** g10805_mm10095_APN2 ccaaca---caagacccgcctttccctcccagccctgaaaaggtcatcgatcaggtgcag g6680_mn10106_APN2 ccaaaa---caagacccgcctgtccctccaagccctgaaagggtcatcgatcaggtgcag g6890_mb07020_APN2 ccaaaacatggcgggcctaccatccccccaagccccgaaagggtcatcgaccaggtccag g9067.t1_mm99049_apn2 ccaaca---caagacccgcctttccctcccagccctgaaaaggtcatcgatcaggtgcag g816.t1_mn12262_apn2 ccaaaa---caagacccgcctgtccctccaagccctgaaagggtcatcgatcaggtgcag g10269.t1_mn11037_apn2 ccgaaa---caggacgcgcctgtccctcccagccccgaaaggatcattgaccaggtgcag ** * * * * * **** ** ***** **** * **** ** ***** *** g10805_mm10095_APN2 tcccgcgagacatggtcgaagctcctgggcaagagggtcgtgcccaggtgcgagcacggc 
g6680_mn10106_APN2 tcccgcgagacatggtcgaagctcctgggcaagagggtcgtgcccaggtgcgagcacggc g6890_mb07020_APN2 tcacgggagacttggtcgaaactcctcggcaagcgagtcgtgcccaagtgcgagcacggc g9067.t1_mm99049_apn2 tcccgcgagacatggtcgaagctcctgggcaagagggtcgtgcccaggtgcgagcacggc g816.t1_mn12262_apn2 tcccgcgagacatggtcgaagctcctgggcaagagggtcgtgcccaggtgcgagcacggc g10269.t1_mn11037_apn2 tcccgcgagacgtggtcgaaacttctaggcaagagggtcgtgccgagatgcgaacacggc ** ** ***** ******** ** ** ****** * ******** * ***** ****** g10805_mm10095_APN2 gaggactgtattagcctggttaccaagaaggccggcttcaacaaagggcgctcgttcttc g6680_mn10106_APN2 gaggactgcatcagcctagtcaccaagaaggccggcttcaacaaagggcgctcgttcttc g6890_mb07020_APN2 gaggactgcatcagcctggtcactaagaaggctggcttcaacaaaggacgctggttctac g9067.t1_mm99049_apn2 gaggactgtattagcctggttaccaagaaggccggcttcaacaaaggccc------g816.t1_mn12262_apn2 gaggactgcatcagcctagtcaccaagaaggccggcttcaacaaaggcc------g10269.t1_mn11037_apn2 gaggactgcatcagtctggttaccaagaaggccggcttcaacaagggtaagcaatatttc ******** ** ** ** ** ** ******** *********** ** g10805_mm10095_APN2 atgtgctcacggcct-ataggcccttcgggt------g6680_mn10106_APN2 atgtgctcacggcct-ataggcccttccggt------g6890_mb07020_APN2 atctgcccgcgacccataggtccatcgggcgagaaggaaacggggacg------g9067.t1_mm99049_apn2 ------ttcgggt------g816.t1_mn12262_apn2 ------cttccggt------g10269.t1_mn11037_apn2 ctagtacccacaggacctagtcgcctcatgaagaatgaggctggtgcgatttcaatgcaa

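g10805_mm10095_APN2     ------------------------------------------------------------
g6680_mn10106_APN2      ------------------------------------------------------------
g6890_mb07020_APN2      ------------------------------------------------------------
g9067.t1_mm99049_apn2   ------------------------------------------------------------
g816.t1_mn12262_apn2    ------------------------------------------------------------
g10269.t1_mn11037_apn2  acaattgaggctgacgtatgctcgacatctgaaggacgctcgttcttcatctgcccacga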

g10805_mm10095_APN2     ------------------gagaaggaaaagggaacagagtggcgctgcgggacgtttata
g6680_mn10106_APN2      ------------------gagaaggaaaaagggacagagtggcgctgcgggacgtttata
g6890_mb07020_APN2      ------------------------------------gagtggcgctgtgggacttttatc
g9067.t1_mm99049_apn2   ------------------gagaaggaaaagggaacagagtggcgctgcgggacgtttata
g816.t1_mn12262_apn2    ------------------gagaaggaaaaagggacagagtggcgctgcgggacgtttata
g10269.t1_mn11037_apn2  cctataggcccttcgggcgaaaaggagaaaggcacagagtggcgctgtgggacatttata
                                                            *********** ***** *****


g10805_mm10095_APN2     tggagcagcgactggactagcaagagc
g6680_mn10106_APN2      tggagcagtgactggactagcaagagc
g6890_mb07020_APN2      tggagtagtgactggaccagcaagagc
g9067.t1_mm99049_apn2   tggagcagcgactggactagcaagagc
g816.t1_mn12262_apn2    tggagcagtgactggactagcaagagc
g10269.t1_mn11037_apn2  tggagcagtgactggactagcaaaagc
                        ***** ** ******** ***** ***
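The asterisk row beneath each alignment block marks the columns at which all six sequences carry the same base. As a minimal sketch of how such a row can be recomputed, assuming the Clustal convention that a column is starred only when every sequence is identical and none has a gap, the final 27-column APN2 block can be checked as follows. The function name `conservation_line` and the dictionary `apn2_last_block` are illustrative and are not part of the original analysis.

```python
def conservation_line(rows):
    """Return a Clustal-style conservation string for aligned rows.

    A column is marked '*' when every row has the same non-gap character,
    and ' ' otherwise. This is an assumed convention, not the thesis's code.
    """
    if len({len(r) for r in rows}) != 1:
        raise ValueError("aligned rows must all have the same length")
    marks = []
    for column in zip(*rows):
        first = column[0]
        conserved = first != "-" and all(base == first for base in column)
        marks.append("*" if conserved else " ")
    return "".join(marks)


# Final block of the APN2 alignment, copied from the appendix above.
apn2_last_block = {
    "g10805_mm10095_APN2": "tggagcagcgactggactagcaagagc",
    "g6680_mn10106_APN2": "tggagcagtgactggactagcaagagc",
    "g6890_mb07020_APN2": "tggagtagtgactggaccagcaagagc",
    "g9067.t1_mm99049_apn2": "tggagcagcgactggactagcaagagc",
    "g816.t1_mn12262_apn2": "tggagcagtgactggactagcaagagc",
    "g10269.t1_mn11037_apn2": "tggagcagtgactggactagcaaaagc",
}

line = conservation_line(list(apn2_last_block.values()))
print(line)
```

Running the same function over any other block gives a quick check that a transcribed conservation row still matches the sequences above it.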


Appendix 5.6 Alignment of M. majus 99049 MAT-region coding sequences with those of M. nivale 11037, with primer loci indicated for SLA2

g814.t1_mn12262_sla2    ATGGCGTCCGCCCGCAGTCTCGACCACGCAAAGTCCGAGGCCGAGCTGGC 50
g9065.t1_mm99049_sla2   ATGGCGTCCGCCCGCAGTCTCGACCACGCAAAGTCCGAGGCTGAGCTGGC 50
g10271.t1_mn11037_sla2  ATGGCGTCCGCCCGCAGTCTCGACCACGCAAAGTCCGAGGCCGAGCTGGC 50
                        ***************************************** ********

g814.t1_mn12262_sla2    CATCAACATCAAAAAGGCTACGAGTCCCGAGGAGTCGGCCCCGAAGCGCA 100
g9065.t1_mm99049_sla2   CATTAACATAAAAAAGGCCACAAGTCCCGAGGAGTCGGCCCCGAAGCGCA 100
g10271.t1_mn11037_sla2  CATCAATATCAAAAAAGCCACGAGCCCCGAGGAGTCGGCCCCCAAGCGCA 100
                        *** ** ** ***** ** ** ** ***************** *******

g814.t1_mn12262_sla2    AGCACGTCCGCAGCTGCATCGTCTACACATGGGACCACAAGTCCTCCCAG 150
g9065.t1_mm99049_sla2   AGCACGTCCGCAGCTGCATCGTCTACACATGGGATCACAAGTCCTCCCAG 150
g10271.t1_mn11037_sla2  AGCATGTCCGCAGCTGCATCGTCTACACATGGGACCACAAGTCTGCCCAG 150
                        **** ***************************** ********  *****

g814.t1_mn12262_sla2    TCCTTCTGGGCTGGGCTCAAGGTGCAGCCTATCCTCGCCGACGAGGTCCA 200
g9065.t1_mm99049_sla2   TCCTTCTGGGCTGGGCTCAAGGTGCAGCCCATCCTCGCCGACGAAGTCCA 200
g10271.t1_mn11037_sla2  TCCTTCTGGGCTGGGCTCAAGGTGCAGCCCATCCTCGCCGACGAGGTCCA 200
                        ***************************** ************** *****

g814.t1_mn12262_sla2    GACATACAAGGCGCTCATCACCATTCACAAAGTCCTGCAAGAGGGCCACC 250
g9065.t1_mm99049_sla2   GACTTACAAGGCGCTCATCACCATTCACAAGGTCCTCCAAGAGGGCCACC 250
g10271.t1_mn11037_sla2  GACATACAAGGCGCTCATCACCATCCACAAGGTCCTCCAAGAGGGGCACC 250
                        *** ******************** ***** ***** ******** ****

g814.t1_mn12262_sla2    CGCAAACCCTCAGAGAGGCGATGGCCAACCGGAGCTGGATCGACAGCCTG 300
g9065.t1_mm99049_sla2   CACAAACTCTCAGGGAGGCAATGGCCAATCGGAGCTGGATCGACAGCCTC 300
g10271.t1_mn11037_sla2  CGCAGACTCTCAGAGAGGCAATGGCCAACCGGAGCTGGATCGACAGCCTT 300
                        * ** ** ***** ***** ******** ********************

g814.t1_mn12262_sla2    AATAGAGGCATGAGCGGCGAGGGTATGCGTGGATATGCCCCTCTCATTCG 350
g9065.t1_mm99049_sla2   AACAGGGGCATGAGCGGCGAGGGTATGCGTGGATATGCCCCTCTCATTCG 350
g10271.t1_mn11037_sla2  AACAGAGGCATGAGCGGCGAGGGTATGCGTGGATATGCTCCTCTCATTCG 350
                        ** ** ******************************** ***********

g814.t1_mn12262_sla2    GGAGTATGTATACTTTCTGCTGGCGAAGCTCTCATTCCACCAGCAGCACC 400
g9065.t1_mm99049_sla2   GGAGTATGTATACTTTCTACTGGCGAAGCTCTCCTTTCACCAACAGCACC 400
g10271.t1_mn11037_sla2  GGAGTATGTATACTTCCTGCTGGCGAAGCTTTCGTTCCACCAGCAGCACC 400
                        *************** ** *********** ** ** ***** *******

g814.t1_mn12262_sla2    CTGAGTTTAACGGCACCTTCGAGTACGAAGAGTACGTCTCGCTCAAGGCC 450
g9065.t1_mm99049_sla2   CTGAATTCAACGGCACTTTCGAGTACGAAGAGTACGTGTCGCTCAAGGCC 450
g10271.t1_mn11037_sla2  CCGAGTTCAACGGCACCTTCGAGTACGAAGAGTATGTGTCGCTCAAGGCC 450
                        * ** ** ******** ***************** ** ************

g814.t1_mn12262_sla2    ACAAACGACCCTAACGAGGGATATGAGACTATCATGGACCTTATGACGCT 500
g9065.t1_mm99049_sla2   ACAAACGACCCCAACGAGGGGTACGAAACCATCATGGATCTCATGACGCT 500
g10271.t1_mn11037_sla2  ACAAACGACCCCAATGAAGGGTATGAGACCATCATGGACCTCATGACGCT 500
                        *********** ** ** ** ** ** ** ******** ** ********

Mn_SLA2_357F:                                                        CCGCA
g814.t1_mn12262_sla2    GCAAGATAAGATCGACCAGTTCCAGAAGCTCATCTTCTCACACTTCCGCA 550
g9065.t1_mm99049_sla2   GCAAGATAAGATCGACCAGTTCCAGAAGCTCATCTTTTCACACTTCCGCA 550
g10271.t1_mn11037_sla2  GCAAGACAAAATCGACCAGTTTCAGAAACTCATCTTCTCACACTTCCGCA 550
                        ****** ** *********** ***** ******** *************

Mn_SLA2_357F:           ATGTCGGCAACA
g814.t1_mn12262_sla2    ATGTCGGAAACAACGAGTGCCGCATTTCCTCCCTCGTGCCCCTTGTTGCC 600
g9065.t1_mm99049_sla2   ACGTCGGCAACAACGAGTGCCGTATCTCTTCCCTCGTGCCCCTTGTCGCC 600
g10271.t1_mn11037_sla2  ATGTCGGCAACAACGAGTGCCGTATATCCTCCCTCGTACCACTCGTCGCC 600
                        * ***** ************** ** ** ******** ** ** ** ***

Mic_SLA2_92F:                                          GCATGCTGCGTGCCATGCA
g814.t1_mn12262_sla2    GAGACATACGGCATCTACAAGTTCATCACAAGCATGCTGCGTGCCATGCA 650
g9065.t1_mm99049_sla2   GAGACATATGGCATCTACAAGTTCATCACGAGCATGCTGCGTGCCATGCA 650
g10271.t1_mn11037_sla2  GAAACATACGGTATCTACAAGTTTATCACGAGCATGCTGCGTGCCATGCA 650
                        ** ***** ** *********** ***** ********************

Mic_SLA2_92F: CTCC g814.t1_mn12262_sla2 CTCCACAACTGGCGATGACGAGGCTCTTGAGCCTCTGCGTGGGCGTTACG 700 g9065.t1_mm99049_sla2 CTCCACAACTGGCGATGACGAGGCTCTCGAGCCGCTGCGTGGACGTTACG 700 g10271.t1_mn11037_sla2 CTCCACAACTGGCGACGACGAGGCTCTTGAGCCGCTGCGTGGACGCTACG 700 *************** *********** ***** ******** ** **** g814.t1_mn12262_sla2 ACGCCCAGCACTACCGGCTTGTCAAATTCTACTATGAGTGCTCCAACTTG 750 g9065.t1_mm99049_sla2 ACGCCCAGCACTACCGACTCGTCAAGTTCTACTATGAGTGCTCCAACTTG 750 g10271.t1_mn11037_sla2 ATGCTCAGCACTACCGACTCGTCAAGTTCTATTACGAATGCTCCAACCTG 750 * ** *********** ** ***** ***** ** ** ********* ** g814.t1_mn12262_sla2 CGCTACCTGACCAGCTTGATCACCATTCCCAAACTGCCACAGGATCCTCC 800 g9065.t1_mm99049_sla2 CGCTACCTGACGAGCTTGATCACTATTCCCAAACTGCCACAGGATCCTCC 800 g10271.t1_mn11037_sla2 CGCTACCTTACCAGCTTAATTACCATTCCCAAACTGCCACAGGATCCTCC 800 ******** ** ***** ** ** ************************** g814.t1_mn12262_sla2 AAACCTCCTGGCTGATGATGAGAATGCGCCAGCGCTTCCTGCTCGCCCCA 850 g9065.t1_mm99049_sla2 GAACCTCTTGGCCGATGATGAGAATGCGCCAGCGCTTCCTGCTCGGCCCA 850 g10271.t1_mn11037_sla2 GAACCTCCTGGCCGAAGACGAGAATGCGCCATCGCTTCCTGCTCGCCCCA 850 ****** **** ** ** ************ ************* **** g814.t1_mn12262_sla2 AGCATGAGATCGAGCGTCAACCGACTCCGGTGCCCCAGCCCAAGACCGAG 900 g9065.t1_mm99049_sla2 AGCAGGAGATCGAGCGCCAGCCGACTCCGGTGCCGCAGCCCAAGACCGAT 900 g10271.t1_mn11037_sla2 AACAGGAGATCGAGCGCCAGCCGACTCCGGTACAGCAGCCTCAGACCGAT 900 * ** *********** ** *********** * ***** ******* g814.t1_mn12262_sla2 GAACCCGACCAAATCGCCGAGTTTTGGCAAGGAGAAATTGAGAGGCAGAA 950 g9065.t1_mm99049_sla2 GAGCCCGACCAAATCGCCGAGTTCTGGCAAGGAGAAATCGAGAGGCAGAA 950 g10271.t1_mn11037_sla2 GAACCCGACCAAATCGCCGAGTTCTGGCAAGGAGAAATTGAGAGGCAAAA 950 ** ******************** ************** ******** ** g814.t1_mn12262_sla2 CAAGGAGTACGAGGATCAGCAGAGGGTGTTGCAGGAGCGCCAGCAGCAAT 1000 g9065.t1_mm99049_sla2 TAAGGAGTACGAGGATCAGCAGAGGGTGCTGCAAGAGCGCCAGCAGCAAT 1000 g10271.t1_mn11037_sla2 CAAGGAGTACGAGGATCAGCAGAGGGTGTTGCAGGAGCGCCAGCAGCAAT 1000 *************************** **** **************** 
g814.t1_mn12262_sla2 CACTGCTCGCCCAGCAACAAGCACAGATGCAGGCACAGCGGGATTTCGAA 1050 g9065.t1_mm99049_sla2 CACTGCTCGCACAGCAACAAGCACAGATGCAGGCACAGCGGGATTTCGAA 1050 g10271.t1_mn11037_sla2 CACTACTTGCCCAGCAACAAGCACAGATGCAGGCACAGCGGGATTTCGAG 1050 **** ** ** ************************************** g814.t1_mn12262_sla2 GAGCAGCAGCGCCGTCTAGCCGAGCAGCAACAGCGCGAACAGGAGGCGCT 1100 g9065.t1_mm99049_sla2 GAGCAGCAGCGCCGTCTAGCCGAGCAGCAACAGCGCGAACAGGAGGCGCT 1100 g10271.t1_mn11037_sla2 GATCAACAGCGCCGCCTGGCCGAGCAGCAACAGCGCGAACAGGAGGCGCT 1100 ** ** ******** ** ******************************** g814.t1_mn12262_sla2 ACTAGCCCAGCAAGCGCAATGGCAAACACAGGGGCGTCTTGCGGAATTGG 1150 g9065.t1_mm99049_sla2 GCTGGCCCAGCAAGCGCAGTGGCAAACGCAGGGACGTCTTGCGGAATTGG 1150 g10271.t1_mn11037_sla2 CCTTGCCCAGCAAGCGCAATGGCACACGCAGGGACGTCTTGCAGAATTGG 1150 ** ************** ***** ** ***** ******** ******* g814.t1_mn12262_sla2 AGCAGGAGAACCTCAATGCTAGGGCACAGTACGAGCGCGATCAATTAATG 1200 g9065.t1_mm99049_sla2 AGCAGGAGAACCTCAATGCAAGAGCGCAGTACGAGCGCGATCAGTTAATG 1200 g10271.t1_mn11037_sla2 AGCAAGAGAACCTCAATGCCAGGGCACAGTACGAGCGAGACCAGCTCATG 1200 **** ************** ** ** *********** ** ** * *** g814.t1_mn12262_sla2 CTGCAGCAGTACGACCAGCGTGTAAAGGCACTGGAAGGCGAATTGGGGCA 1250 g9065.t1_mm99049_sla2 CTGCAGCAGTACGACCAGCGTGTAAAGGCACTGGAAGGCGAACTGGGGCA 1250 g10271.t1_mn11037_sla2 CTGCAGCAGTACGACCAGCGTGTAAAGGCACTGGAAGGCGAACTTGGGCA 1250 ****************************************** * ***** g814.t1_mn12262_sla2 GATCCAAGGGAGTTACGGCCAGCAGATGACAAGCAAGGATGATCAGATCC 1300 g9065.t1_mm99049_sla2 AATCCAAGCGAGTTACGGCCAGCAGATGACAAGCAAGGACGATCAGATCC 1300 g10271.t1_mn11037_sla2 GATCCAGGCGAGTTACGGCCAGCAGATGACAAGCAAGGACGATCAGATCC 1300 372

***** * ****************************** ********** g814.t1_mn12262_sla2 GCGCACTCCAAGAACAGGTGAACACCTGGCGGAGCAAGTACGAGGCACTC 1350 g9065.t1_mm99049_sla2 GTGCACTCCAAGAACAGGTCAACACCTGGCGGAGCAAGTACGAGGCACTT 1350 g10271.t1_mn11037_sla2 GCGCACTCCAAGAACAGGTCAACACCTGGAGGAGCAAGTACGAAGCACTC 1350 * ***************** ********* ************* ***** g814.t1_mn12262_sla2 GCGAAGCTTTATTCGCAGCTGAGACATGAACATCTCGACCTGCTGCAGAA 1400 g9065.t1_mm99049_sla2 GCGAAGCTTTATTCGCAGCTGAGACATGAACATCTCGATTTGCTGCAGAA 1400 g10271.t1_mn11037_sla2 GCGAAGCTTTACTCGCAGCTGCGACATGAACACCTTGACCTACTGCAGAA 1400 *********** ********* ********** ** ** * ******** g814.t1_mn12262_sla2 GTTCAAGTCCGTGCAACTGAAAGCAGCCTCGGCCCAGGAGGCTATCGAGC 1450 g9065.t1_mm99049_sla2 GTTTAAGTCGGTGCAACTGAAAGCAGCCTCGGCCCAGGAGGCTATCGAGC 1450 g10271.t1_mn11037_sla2 GTTCAAGTCGGTACAATTGAAGGCAGCCTCTGCACAAGAGGCTATCGAGC 1450 *** ***** ** *** **** ******** ** ** ************* g814.t1_mn12262_sla2 GGCGAGAAAAGCTAGAGAGAGAGATCAAGACTAAAAACCTCGAGCTGGCT 1500 g9065.t1_mm99049_sla2 GGCGAGAGAAGCTGGAGCGAGAGATCAAGACCAAAAATCTCGAGCTTGCT 1500 g10271.t1_mn11037_sla2 GGCGGGAGAAGCTGGAGAGAGAGATTAAGACCAAGAATCTCGAGCTGGCT 1500 **** ** ***** *** ******* ***** ** ** ******** *** g814.t1_mn12262_sla2 AACATGATCCGCGAGAGAGACCGCGCATTGCACGACCGCGACCGCCTCAC 1550 g9065.t1_mm99049_sla2 AATATGATCCGCGAGAGAGATCGCGCTCTGCACGACCGCGATCGCCTCAC 1550 g10271.t1_mn11037_sla2 AACATGATACGCGAGAGGGACCGGGCGCTGCACGACCGCGATCGCCTGAC 1550 ** ***** ******** ** ** ** ************* ***** ** g814.t1_mn12262_sla2 AGGCGGAAACAAGGAAGAGCTTGAGAAGCTCAAGAGAGAGTTGCGCATGG 1600 g9065.t1_mm99049_sla2 AGGCGGAAACAAGGAAGAGCTCGAAAAGCTCAAGAGAGAGCTACGCATGG 1600 g10271.t1_mn11037_sla2 CGGCGGCAACAAAGAGGAACTTGAGAAGCTCAAGAGAGAGCTGCGCATGG 1600 ***** ***** ** ** ** ** *************** * ******* g814.t1_mn12262_sla2 CACTTGATCGCGCTGACAACCTCGAGAGAGCCAAAGGAAACGAGCTCTCG 1650 g9065.t1_mm99049_sla2 CACTTGACCGTGCCGACAACCTCGAGAGAGCTAAAGGAAACGAGCTCTCG 1650 g10271.t1_mn11037_sla2 CACTTGATCGTGCTGACAACCTCGAGAGAGCAAAAGGAAATGAGCTTTCA 1650 ******* 
** ** ***************** ******** ***** **

Mn_SLA2_1156F : CCT g814.t1_mn12262_sla2 TCAATGCTGTCCAAGTACAACAGAGAGATGGCTGACCTGGAGGAGGCCCT 1700 g9065.t1_mm99049_sla2 TCCATGCTGTCCAAGTATAACAGAGAGATGGCTGATCTGGAGGAGGCCCT 1700 g10271.t1_mn11037_sla2 TCAATGCTGTCTAAGTACAACAGAGAGATGGCTGACTTGGAGGAGGCCCT 1700 ** ******** ***** ***************** ************* Mn_SLA2_1156F: CCGAACCAAGTCGC g814.t1_mn12262_sla2 TCGAACCAAGTCTCGGGCGCTCGAGGAAGCCCAAAACAACATGCGGAGCG 1750 g9065.t1_mm99049_sla2 CCGAACCAAGTCGCGGGCGCTCGAGGAAGCCCAAAGCAACATGCGGAGCG 1750 g10271.t1_mn11037_sla2 CCGCACCAAGTCGCGGGCACTTGAGGAAGCCCAAAACAACATGCGGAGCG 1750 ** ******** ***** ** ************* ************** g814.t1_mn12262_sla2 GCAGCTCTGATCTCGAGCAGCTTCTTAGCGATAAGGAAGAAGAGCTCGAG 1800 g9065.t1_mm99049_sla2 GCAGCTCGGATCTCGAGCAACTTCTCAGCGATAAGGAAGAGGAGCTTGAG 1800 g10271.t1_mn11037_sla2 GCAGCTCGGATCTTGAGCAACTTCTCAGTGACAAGGAAGAAGAGCTCGAG 1800 ******* ***** ***** ***** ** ** ******** ***** *** g814.t1_mn12262_sla2 GTCTACAAGGCCAGTTTGGACCAGGCACTCGTTGAGCTCACCACGCTGAG 1850 g9065.t1_mm99049_sla2 GTCTACAAGGCCAGTCTGGATCAGGCACTCGTCGAGCTCACCACGCTGAG 1850 g10271.t1_mn11037_sla2 GTCTACAAGGCCAGTCTGGACCAAGCACTCGTCGAGCTCACCACGCTGAG 1850 *************** **** ** ******** ***************** g814.t1_mn12262_sla2 AGAGAGCCAAGGCGCTACTGACGAAGCCCTTGACTCTGCTCTTTACGGCG 1900 g9065.t1_mm99049_sla2 AGAGAGCCAAGGTGCTACTGATGAGGCCCTTGACTCTGCTCTTTATGGCG 1900 g10271.t1_mn11037_sla2 AGAGAGCCAGGGTGCTACGGACGAGGCCCTAGACTCCGCCCTTTATGGCG 1900 ********* ** ***** ** ** ***** ***** ** ***** **** g814.t1_mn12262_sla2 CGAACCTCGACAGAATCAACCACATGATCGATTCCGTGTTGGAGGCTGGT 1950 g9065.t1_mm99049_sla2 CCAACCTCGATAGAATCAACCACATGATCGATTCGGTGTTGGAGGCTGGT 1950 g10271.t1_mn11037_sla2 CGAACCTCGACAGAATCAACCACATGATCGATTCGGTGCTGGAGGCTGGT 1950 * ******** *********************** *** *********** g814.t1_mn12262_sla2 GTAGCTCGTGTCGACGACGCTCTTTACGAGCTGGATTCGAGCATGCAAGC 2000 373

g9065.t1_mm99049_sla2 GTGGCTCGTGTCGACGATGCTCTTTACGAACTGGACTCAAGCATGCAGGC 2000 g10271.t1_mn11037_sla2 GTGGCACGTGTCGACGACGCTCTTTACGAACTGGACTCGAGCATGCAGGC 2000 ** ** *********** *********** ***** ** ******** ** g814.t1_mn12262_sla2 TGGTAACCAAAACGCCTCGCCCACCTATGTGCTGTCCCAAATCGAAAAGG 2050 g9065.t1_mm99049_sla2 TGGTAACCAAAACGCCTCGCCCACCTACGTGCTGTCCCAAATCGAGAAGG 2050 g10271.t1_mn11037_sla2 TGGCAATCAAAACGCCTCGCCCACATACGTGCTCTCCCAAATCGAGAAGG 2050 *** ** ***************** ** ***** *********** **** g814.t1_mn12262_sla2 CCTCGGCCACTGCCACAGAATTTGCAACCGCATTCAACGACTTCCTGGCC 2100 g9065.t1_mm99049_sla2 CCTCAGCCACTGCTACAGAATTTGCGACCGCATTTAACGACTTCCTGGCC 2100 g10271.t1_mn11037_sla2 CGTCGGCCACTGCTACAGAATTTGCAACCGCGTTCAACGACTTCCTGGCC 2100 * ** ******** *********** ***** ** *************** g814.t1_mn12262_sla2 GACATTCCCAATGCCGACCACGCTAATGTCATCAAGGCTATCAACGTGTT 2150 g9065.t1_mm99049_sla2 GATATTCCCAACGCCGACCACGCTAACGTCATCAAAGCCATCAACGTGTT 2150 g10271.t1_mn11037_sla2 GACATTCCCAACGCCGACCATTCTAACGTCATCAAGACCATCAACGTGTT 2150 ** ******** ******** **** ******** * *********** g814.t1_mn12262_sla2 CTCCGGCGCTATTGCAGATGTTTGCAGCAATACCAAGGGTCTGACACGCC 2200 g9065.t1_mm99049_sla2 CTCTGGTGCTGTTGCAGATGTCTGCAGCAACACCAAGGGTCTGACACGCC 2200 g10271.t1_mn11037_sla2 CTCTGGTGCTATTGCAGATGTCTGTAGCAACACCAAGGGTCTGACACGCC 2200 *** ** *** ********** ** ***** ******************* g814.t1_mn12262_sla2 TCGCAACTGATGACAAGAAGACTGACCAGCTCATGAACGGTGCCCGAGTA 2250 g9065.t1_mm99049_sla2 TTGCAACTGATGACAAGAAGACCGACCAGCTCATGAACGGTGCCCGAGTG 2250 g10271.t1_mn11037_sla2 TTGCGACTGATGACAAGAAGACTGACCAGCTCATGAATGGTGCCCGAGTA 2250 * ** ***************** ************** *********** g814.t1_mn12262_sla2 GCGGCGCAGTCCGCCATTCGGTTCTTCAGGGGACTCCTGAGCTTCCAGCT 2300 g9065.t1_mm99049_sla2 GCAGCGCAGTCCGCCATTCGATTCTTCAGGGGGCTCCTGAGCTTCCAGCT 2300 g10271.t1_mn11037_sla2 GCGGCGCAATCTGCTATTCGATTCTTCAGGGGACTCCTGAGCTTTCAGCT 2300 ** ***** ** ** ***** *********** *********** ***** g814.t1_mn12262_sla2 AGTAGACAGAGAAGCCGAAGAGAAACAGGACATTGTAATCAATAGCAACA 2350 
g9065.t1_mm99049_sla2 GGTAGACAGAGAAGCCGAGGAGAAGCAAGACATCGTGATTAATAGCAACA 2350 g10271.t1_mn11037_sla2 GGTAGACAGGGACGCTGAAGAGAAGCAGGACGTGGTGATCAACAGCAACA 2350 ******** ** ** ** ***** ** *** * ** ** ** ******* g814.t1_mn12262_sla2 TTGATGTCCAGATGAACCTGCAAACTCTGAACAAGCTCGTAGAGACATTC 2400 g9065.t1_mm99049_sla2 TTGATGTCCAGATGAACCTGCAAACTCTGAACAAGCTTGTAGAGACATTC 2400 g10271.t1_mn11037_sla2 TTGATGTTCAGATGAACCTGCAAACTCTGAACAAGCTCATTGAGACATTC 2400 ******* ***************************** * ********* g814.t1_mn12262_sla2 GCGCCCGGCTTCGGAAAACTTGCTACAAACAAGGGCGACATTGGCGACCT 2450 g9065.t1_mm99049_sla2 GCGCCCGGCTTTGGAAAGCTTGCTACAAACAAGGGCGACATCGGCGACCT 2450 g10271.t1_mn11037_sla2 GCGCCCGGCTTCGGAAAGCTTGCTACGAACAAGGGTGATATCGGCGACCT 2450 *********** ***** ******** ******** ** ** ******** g814.t1_mn12262_sla2 TGTTGATTCCGAGCTGAGTAAAGCGGCGGATGCCATTGCCGCTGCTGCTG 2500 g9065.t1_mm99049_sla2 GGTTGATTCCGAGCTGAGTAAAGCGGCGGATGCCATTGCTGCTGCGGCTG 2500 g10271.t1_mn11037_sla2 GGTTGATTCCGAGCTAAGCAAGGCGGCGGATGCTATCGCCGCCGCTGCTG 2500 ************** ** ** *********** ** ** ** ** **** g814.t1_mn12262_sla2 CTCGGCTCGCCAAGCTCAGAAACAAGCCGCGCGACAAATACTCGACCTAC 2550 g9065.t1_mm99049_sla2 CTCGGCTTGCCAAGCTCAGAAACAAGCCACGCGACAAGTACTCGACCTAC 2550 g10271.t1_mn11037_sla2 CTCGGCTCGCCAAGCTCAGAAACAAGCCGCGCGACAAATACTCAACCTAC 2550 ******* ******************** ******** ***** ****** g814.t1_mn12262_sla2 GAGCTCAAGGTTCACGACTCAATCTTGGATGCTGCCCTGGCCATCACGAA 2600 g9065.t1_mm99049_sla2 GAGCTCAAGGTTCACGACTCGATCTTGGATGCTGCCCTGGCCATCACGAA 2600 g10271.t1_mn11037_sla2 GAGCTCAAGGTCCACGACTCGATCTTGGATGCTGCTCTAGCCATCACAAA 2600 *********** ******** ************** ** ******** ** g814.t1_mn12262_sla2 CGCTATCGCCAGGCTCATCAAGGCAGCCACGGTCACGCAGCAGGAAATCG 2650 g9065.t1_mm99049_sla2 CGCTATCGCCAAGCTCATTAAAGCAGCTACAGTCACCCAGCAAGAAATTG 2650 g10271.t1_mn11037_sla2 CGCCATCGCCAGACTAATTAAAGCCGCTACAGTGACCCAGCAGGAAATTG 2650 *** ******* ** ** ** ** ** ** ** ** ***** ***** *


g814.t1_mn12262_sla2 TGCAGGCTGGCAGAGGATCATCCTCGAGGACTGCGTTCTACAAGAAGAAC 2700 g9065.t1_mm99049_sla2 TGCAGGCTGGCAGAGGATCATCCTCGAGGACTGCGTTCTACAAGAAGAAT 2700 g10271.t1_mn11037_sla2 TGCAGGCTGGCAGAGGATCGTCCTCGAGGACTGCGTTTTACAAGAAGAAC 2700 ******************* ***************** *********** g814.t1_mn12262_sla2 AACCGTTGGACCGAGGGTCTCATCTCGGCGGCCAAGGCCGTGGCTTCTTC 2750 g9065.t1_mm99049_sla2 AATCGTTGGACCGAGGGTCTCATCTCGGCGGCCAAGGCCGTGGCCTCTTC 2750 g10271.t1_mn11037_sla2 AATCGTTGGACCGAGGGTCTCATCTCGGCAGCCAAGGCCGTAGCCTCTTC 2750 ** ************************** *********** ** ***** g814.t1_mn12262_sla2 GACCAACACTCTTATTGAGACCGCCGACGGTGTGCTTTCTAACCGCAACA 2800 g9065.t1_mm99049_sla2 GACCAATACTCTCATTGAGACCGCCGACGGTGTGCTTTCTAACCGCAACA 2800 g10271.t1_mn11037_sla2 GACCAACACATTAATCGAGACCGCCGATGGTGTGCTGTCTAACCGTAACA 2800 ****** ** * ** *********** ******** ******** **** g814.t1_mn12262_sla2 GCCCCGAACAGCTGATCGTGGCGTCCAACAACGTTGCTGCCTCCACAGCC 2850 g9065.t1_mm99049_sla2 GCCCCGAGCAGCTGATCGTGGCATCCAACAACGTTGCTGCCTCCACAGCC 2850 g10271.t1_mn11037_sla2 GCCCCGAGCAGTTGATTGTCGCGTCCAACAACGTTGCTGCCTCCACAGCG 2850 ******* *** **** ** ** ************************** g814.t1_mn12262_sla2 CAGCTTGTCGCTGCCAGCCGCGTCAAGGCTGGCTTCATGAGCCAGAACCA 2900 g9065.t1_mm99049_sla2 CAGCTTGTCGCTGCCAGCCGTGTCAAGGCTGGCTTCATGAGCCAGAACCA 2900 g10271.t1_mn11037_sla2 CAACTTGTCGCTGCCAGCCGTGTCAAGGCTGGCTTCATGAGTCAGAACCA 2900 ** ***************** ******************** ******** g814.t1_mn12262_sla2 GGACGACCTGGAGCAGGCCAGCAAGGCTGTTGGTGCCGCTTGCCGTTCCC 2950 g9065.t1_mm99049_sla2 GGATGATCTGGAGCAGGCTAGCAAGGCTGTTGGTGCCGCTTGCCGTTCCC 2950 g10271.t1_mn11037_sla2 GGACGACCTGGAGCAGGCCAGCAAGGCCGTTGGCGCTGCTTGCCGCTCCC 2950 *** ** *********** ******** ***** ** ******** **** g814.t1_mn12262_sla2 TGGTCCGGCAGGTGCAAGCGCTTATCAAGGAGCGCTCGAACGAGGAGGAC 3000 g9065.t1_mm99049_sla2 TGGTCCGGCAGGTACAGGCGCTTATCAAGGAGCGCTCGAACGAGGAGGAC 3000 g10271.t1_mn11037_sla2 TGGTCCGGCAGGTGCAGGCGCTTATCAAGGAGCGCTCGAACGAGGAGGAC 3000 ************* ** ********************************* 
g814.t1_mn12262_sla2 TCGGTGGACTATGGAGCTCTTGGCGCGCACGAGTTTAAGGTGCGGGAAAT 3050 g9065.t1_mm99049_sla2 TCAGTGGACTATGGAGCTCTTGGCGCGCACGAGTTCAAGGTGCGAGAAAT 3050 g10271.t1_mn11037_sla2 TCGGTAGACTATGGAGCTCTTGGTGCGCACGAGTTCAAGGTTCGGGAAAT 3050 ** ** ***************** *********** ***** ** ***** g814.t1_mn12262_sla2 GGAGCAACAG------GTCGAAATCCTCAAGCTTGAGAACTCGT 3088 g9065.t1_mm99049_sla2 GGAGCAACAG------GTCGAAATCCTCAAGCTTGAGAACTCGT 3088 g10271.t1_mn11037_sla2 GGAGCAACAGTTTCTGTCCCAGGTTGAAATCCTCAAGCTCGAGAACTCGT 3100 ********** ** ************** ********** g814.t1_mn12262_sla2 TGTCCGCGGCAAGACACAGGTTGGGTGAGATGCGTAAGATCTCATACCAG 3138 g9065.t1_mm99049_sla2 TGTCCGCGGCAAGACACAGGCTGGGCGAGATGCGCAAGATCTCATACCAG 3138 g10271.t1_mn11037_sla2 TATCCGCGGCAAGACACAGGTTGGGCGAGATGCGCAAGATCTCATACCAG 3150 * ****************** **** ******** *************** g814.t1_mn12262_sla2 GAGGAG 3144 g9065.t1_mm99049_sla2 GAGGAG 3144 g10271.t1_mn11037_sla2 GAGGAG 3156 ******


Appendix 5.7 Alignment of M. majus 99049 MAT-region coding sequences with those of M. nivale 11037, with primer loci indicated for MAT1-2-1

g815.t1_mn12262_mat2 ATGCAGCCCAATAACCAAGGCGGTGCCTTCGGTGCCAATGGCGGCGGTTC 50 g9066.t1_mm99049_mat2 ATGCAGCCCAACTACCACGGCGGTGCCTTTGGTGCCAATGGCGACGGTTC 50 g10270.t1_mn11037_mat2 ATGCAGCCAAGCACTCCAGGTGGTGCCTTCGGTACCAATGGCAGCGGCTC 50 ******** * * ** ******** *** ******** *** **

g815.t1_mn12262_mat2 AGGAGGATACGGCCCTCCGGCCCCAGCTCCAGGTCTGCCACAGCCGGCTG 100 g9066.t1_mm99049_mat2 GGGAGGATACGGCCCTCCAGTTCAAGCGTCAGGTCTGCAACAGCCGACTG 100 g10270.t1_mn11037_mat2 TGGAGGATACGGCCTCCCGGTTCCAGCACCAGGTCTGCCACAGCCCTCTG 100 ************* ** * * *** ********* ****** ***

g815.t1_mn12262_mat2 GTGTCGTC---CGCAATCCGGCCCTGGTCCGCCTGCTCCACGAGTGGTAT 147 g9066.t1_mm99049_mat2 GTGTCGTC---CGCAACCCGACTCTCGTCCGCCTGCTCCATGAGTGGCAG 147 g10270.t1_mn11037_mat2 GTATTGCCGACCGCAACCCGGCTCTCTTCCGCCTGCTCCACGAGTGGCAG 150 ** * * * ***** *** * ** ************* ****** *

g815.t1_mn12262_mat2 TACTGCCAACAGCTTCAGCGTCCCGCAGTCGATGTTGTCTGCATCCCTAG 197 g9066.t1_mm99049_mat2 TACTGCCAGCAGCTTCAGCGTCCTGCAGTCGACATTGTCTTTATCCCCAG 197 g10270.t1_mn11037_mat2 TACTGTCAGCAGCTTCAACGTCCCACATTCGACGTTGTCTGCATCCCCAG 200 ***** ** ******** ***** ** **** ****** ***** **

g815.t1_mn12262_mat2 CTCAATCTTTGACCGCTGGTCTGCTCAGGCAAAGAACTTGGTCCGACAAC 247 g9066.t1_mm99049_mat2 CTCAATCTTTGACCGCTGGTCTGCTCAGGCAAGGAACTTGATTAGGCAAC 247 g10270.t1_mn11037_mat2 CTCAATCTTTGACCGCTGGTCGCCCCAGGCAAAAAATATGGTTCGGCAAA 250 ********************* * ******* ** ** * * ***

g815.t1_mn12262_mat2 TACACGGTGCTTCAACCCGGAAGGATGCGGTCTTCTGTTTTGACTCCTAC 297 g9066.t1_mm99049_mat2 TACACGGTGCTTCCACCCGGAAGGATGTGGTCTTCTGTTTTGACTCCTAC 297 g10270.t1_mn11037_mat2 TGCACGGTGCCACCACCCGCAAGGATGTGGTTTACTGTTTTGACTCTTAT 300 * ******** * ***** ******* *** * ************ **

g815.t1_mn12262_mat2 AGCTCGGGCCATGTGTACCTGGGGGCTTTGATGGATTTCATTGTAGCTGG 347 g9066.t1_mm99049_mat2 AGCTCGGGTCATGTGTATCTGGGCGCTTTGATGGATTTCATTGTAGCTGG 347 g10270.t1_mn11037_mat2 GATTCGGGCCGTATATACCTGGGGTCCCTAATGGATTTCATCTCGGCCGG 350 ***** * * * ** ***** * * *********** ** **

g815.t1_mn12262_mat2 CTACTGGATCCATCAAATGGCCGGAAGCAGCATGCCAGCCGTGGGCTTGA 397 g9066.t1_mm99049_mat2 CTACTGGATCCATCAAATGGCCGGAAGCAGCATGCCAGCAGTGGGCTTGA 397 g10270.t1_mn11037_mat2 CTACTGGATCCATCAAATGCCTGGAAGCAGAATGCCAGCAGTGGGTTTGA 400 ******************* * ******** ******** ***** ****

g815.t1_mn12262_mat2 TTCCTCAGCAGTTGGGCGTGCCTGTTCACTCTCAATTGCAATCTGTTCTT 447 g9066.t1_mm99049_mat2 CCGCTCAGCAGTCGGGCTTGCATGTTCCCTCTCAATCGTCCTCTGCTCTT 447 g10270.t1_mn11037_mat2 TCTCTCAGGAGCCGGGTTTGGCTGCGCCTTCTCAACCGTCTTCTACTTTT 450 ***** ** *** ** ** * ****** * *** * **

g815.t1_mn12262_mat2 GGACAGTCGGTCCAGGCTTCTGCTTCTACAGTCTTCCCTGCTCCCGCTGT 497 g9066.t1_mm99049_mat2 GGACAGTCGGTCCAGGCTTCTGCTTCCACAGTCTTCACAGCTCCTTCTGC 497 g10270.t1_mn11037_mat2 GATCGTTTGGTTCAGTTTCCTGTCTCCACAGTTC---CTGCTTCCACGGT 497 * * * *** *** * *** ** ***** * *** * * *

g815.t1_mn12262_mat2 TCTCCAACCTTCGAGTGCTGCACAGAATCCACCTCCAACTCAAGCTGGAT 547 g9066.t1_mm99049_mat2 TCTCCAACCTTCGAATGCGGCACAAAATCCACCTCCACCTCAAGCTGAAT 547 g10270.t1_mn11037_mat2 TTTGCAGCCTTCCAATGCTGCACAGGGACCATATCAGCCACAGTTTGATT 547 * * ** ***** * *** ***** *** ** * ** ** *

g815.t1_mn12262_mat2 CCATTCATCTGACCAACTCGACTCAATCG--GCCGATCCCCAGCTTCAGA 595 g9066.t1_mm99049_mat2 CTGGTCATCAGACCAACTCGACTC--TTGATGCTGATCTACAGCTCCAGA 595 g10270.t1_mn11037_mat2 CCAGTCATCAAATCATCTCTACTT--CAGCAGCCGATACTGAGCCCCAAA 595 * ***** * ** *** *** * ** *** *** ** *

g815.t1_mn12262_mat2 GCACGACTCGTCCTAGCAGTCCCAGCACCCCTACTAATCAAAGTGCTTCT 645 g9066.t1_mm99049_mat2 GCACAACTCGTCC------CAGCACCCCTACGGAGCAAAGTGCTTCT 636 g10270.t1_mn11037_mat2 GCACAACTCG------605

**** ***** g815.t1_mn12262_mat2 CGCAAGAGACCGATCGTAGACATCTCAGGAAGCCAGGAAGTTTCAGGCCC 695 g9066.t1_mm99049_mat2 CGCAAGAGACCGATCGCAGACATCTCGGGAAGCCAGGAAGTTTCAGATCC 686 g10270.t1_mn11037_mat2 ------

g815.t1_mn12262_mat2 TAACCCAGGCTCTGACGGGAACGGTAGCGCTCAACCAAGCGGCCCTCCTG 745 g9066.t1_mm99049_mat2 GGACCCAGGCTTTAATGCCGACGAGGGTGT--ATCC--GCGACC-----A 727 g10270.t1_mn11037_mat2 ------

g815.t1_mn12262_mat2 CTGAGGGCGATGTCGAGCACATACCAGTGGCGGATACTGGTTCGGTAGGA 795 g9066.t1_mm99049_mat2 GTGAGGGCAACGTCGAGCAAACGCCAGTGGCGGGTACTGGTTCGGTAAAA 777 g10270.t1_mn11037_mat2 ------

g815.t1_mn12262_mat2 AGCTCGACCGAAAGGCCTGCTAAGAAGGTCAAGAGGGC-AAAAAAGCCAA 844 g9066.t1_mm99049_mat2 GGCTCGATCGAAAAACCTGCCAAGAAGATCAAAAGGGC-AAAAAATCCAA 826 g10270.t1_mn11037_mat2 ------GCCTGCCAAGAGGGTGAAGAGGGCCAAGAGAGCCAA 641 ***** **** * * ** ***** ** * * **** g815.t1_mn12262_mat2 AGAACAACGATGGCGTTCCTCGTCCTTCGAACTCTTGGATCCTGTATAGA 894 g9066.t1_mm99049_mat2 AGAGAAACGATGGCGTTCCTCGCCCTTCGAACTCTTGGATTCTGTACAGA 876 g10270.t1_mn11037_mat2 -GAACAACAATGGTGTTCCCCGTCCTTCAAATGCTTGGATCCTGTACAGA 690 ** *** **** ***** ** ***** ** ******* ***** *** g815.t1_mn12262_mat2 ACAGCTCATCGTATGGAGTTTGCAGAGAAGGAGCCGACATTGGACAACTG 944 g9066.t1_mm99049_mat2 ACTGCCCACCGTATGGAGTTTGCCGAGAAGGAGCCCGAAATGGACAACTG 926 g10270.t1_mn11037_mat2 ACAGCTCGCCGTAGGGATTTTGCCGACGTCGCTCCCAGTACAGACAACTG 740 ** ** * **** *** ***** ** * ** ********

Mn_MAT2_20F: AGCATGGC g815.t1_mn12262_mat2 CAGCCTCTCTAACTCGTTCCCGTTAGCGAAAGTTATTGCCAAAGCATGGC 994 g9066.t1_mm99049_mat2 CAGCCTCT------CGAAAGTTATTGCCAAAGCATGGC 958 g10270.t1_mn11037_mat2 CAACCTCT------CAAAAATTATTGCCCAAGCATGGC 772 ** ***** * *** ******** *********

Mn_MAT2_20F: ACAATGAGCC g815.t1_mn12262_mat2 ACAAGGAGCCCGCTGATGTCAAGGCGTACTGGAAGCAGAAAGAAAAGGAA 1044 g9066.t1_mm99049_mat2 ACAAGGAACCCGCAGATGTCAAGGCGTACTGGAAACAGAAAGAAAAGGAA 1008 g10270.t1_mn11037_mat2 ACAATGAGCCCGCTGATGTCAGGGCGTATTGGAAGCAGAAGGAGAGGGAA 822 **** ** ***** ******* ****** ***** ***** ** * **** g815.t1_mn12262_mat2 GTCCGGGATGAGCACAGGAGGCTACATCCAGCTTATAAGTACGCCCCAAC 1094 g9066.t1_mm99049_mat2 GTCCGGGATGAGCACAGGAGGCTACATCCAGATTACAAGTACGCCCCAAC 1058 g10270.t1_mn11037_mat2 GTCCGTGACGAACACAAGCGGCTTCACCCAGGTTATAAATACGCACCGAC 872 ***** ** ** **** * **** ** **** *** ** ***** ** **

Mic_MAT2_198F: CGAAGCGYGAGRCGAAG g815.t1_mn12262_mat2 AGCCCCGAAGCGTGAGACGAAGAAGCTAGCTCGGAAACCTCGCCAACCCG 1144 g9066.t1_mm99049_mat2 AGCCTCGAAGCGTGAGACGAAGAAACCAGCGCAGAAATCTCGCCAACCCA 1108 g10270.t1_mn11037_mat2 TGCCCCGAAGCGCGAGGCGAAGAAGCCAGCTCGGAAACCTCGCGAACCCA 922 *** ******* *** ******* * *** * **** ***** ***** Mn_MAT2_3347F: TCGCCGCTATCCCCACT g815.t1_mn12262_mat2 AGGCCGTCGCTACCCCAACTGTCGAGTTTAGATCGCCTGACGTTGCGCAA 1194 g9066.t1_mm99049_mat2 AGGTCGCCGCTATCCCCACTGCCGAGTTCGGAACGTCTGACGTTGCGCAA 1158 g10270.t1_mn11037_mat2 AGGTTGCCCCTACCCTTGCGGTCGAGCTGGGAATATCTGCGGTTGCGCAA 972 *** * * *** ** * * **** * ** *** ********* g815.t1_mn12262_mat2 AGCCCAGCAAAGGAGATTGTTGAAAGCGAGATCTTTTCTACCGAGAGCAC 1244 g9066.t1_mm99049_mat2 AGCCAAGTAAAGGAGACGGTTGAGAGTGAGATCATTCTCACCGAGAGCAC 1208 g10270.t1_mn11037_mat2 AGCCCAACAAAGACTATCGTTCAGGACGAGATCATCTCCTTTAAGACTGC 1022 **** * **** * *** * ****** * *** * g815.t1_mn12262_mat2 TGATTCTGCAGACAAGCTTGCGTCTACTGAAGCTCTTGGCAATGTACCAA 1294 g9066.t1_mm99049_mat2 TGATTCTGCCGACAAGCTTGCGTCTGCTGATGCCCTTGGCGATGTATCAA 1258 g10270.t1_mn11037_mat2 TGATTCTACCGACAAGTTTGTATCTACCGAAGCCATTGGCAACATGTCAA 1072 ******* * ****** *** *** * ** ** ***** * * *** 377

g815.t1_mn12262_mat2 CACCTCAAGAGGTCACAAGCATTGCGTTGTTCAGCCCGAGCATTGGCGTC 1344 g9066.t1_mm99049_mat2 CACCTCAAGAGGTCACAAACATTGCGTTGTTTAGCCCGAGCATTGACGTC 1308 g10270.t1_mn11037_mat2 CACTGCAAGAGGTCACCAATATTGCTTCGGTCAGCCCGATCATTGACGTT 1122 *** *********** * ***** * * * ******* ***** *** g815.t1_mn12262_mat2 CAACTCCCATCACCGCAGCCAGCTGCTGCTCTCAAGTCCGACAAGTCCGA 1394 g9066.t1_mm99049_mat2 CAACTACCATCACCACAGCCAGATGCCACTCT------AAC--GTCCGA 1349 g10270.t1_mn11037_mat2 CAACTCCCGTTGCCGCTATCAAATGCCGCTCT------CCAGATTGA 1163 ***** ** * ** * ** *** **** * * ** g815.t1_mn12262_mat2 TCAGGCCGCCGCATCCTCACTTGCGCCGGA------GGATCCGGAATCCA 1438 g9066.t1_mm99049_mat2 TCAGACCGCCACATCCTCACTTGCGACAAA------GGACCAGCAATCCG 1393 g10270.t1_mn11037_mat2 TCACGCCACCGCAAACTCACTCGCACCAGCACCAGTGGACCAGGAACCCG 1213 *** ** ** ** ****** ** * *** * * ** ** g815.t1_mn12262_mat2 GTCTCTTCGACTATATTAACGAGTATCTAGACGCCAATCCAACCATTGAT 1488 g9066.t1_mm99049_mat2 ATCTCTACGATTATATCACCGAATATCTGAACGCCAATCCAGCCATCGAT 1443 g10270.t1_mn11037_mat2 GTGAACTCAATCGCATTACCGATTTCTTGAACTCCAATTCAATCAATCAT 1263 * * * ** * *** * * ** ***** ** ** ** g815.t1_mn12262_mat2 ATTCTGGCCGAAGACTTTGTCATAGGCACCAACGACATCACTGCCATCAT 1538 g9066.t1_mm99049_mat2 ATCCTGGCC------ACTAACGACATTCCTGTCGTGAC 1475 g10270.t1_mn11037_mat2 GTCCTGGCTTACAACGCTGACGCTGTCCCTAACTGCATCTCTGCCGCCAT 1313 * ***** * *** *** *** * * g815.t1_mn12262_mat2 TGACACAACCACTGACACTCCCGCCGACCTTTCTCCAACGTCTCACGCCA 1588 g9066.t1_mm99049_mat2 TG------CCACTG------CCACCGACCTTTCTCCCACGTCTCACGCCA 1513 g10270.t1_mn11037_mat2 CGA------CACAG------CCACTGACGTATCCTCCATGTTTGGCGCCA 1351 * *** * ** * *** * ** * * ** * *****

Mic_MAT2_676R: ACACTCAAGAYCAGAGCTT g815.t1_mn12262_mat2 CTCCTGCCGACACTCAAGATCAGAGCTTCTTAGATAGTGGACTGTTTCCA 1638 g9066.t1_mm99049_mat2 CTCCTGCCGACACTCAAGATCAGAGCTTCTTGGCCCGTGAAGTGTTCCCA 1563 g10270.t1_mn11037_mat2 TTCCTGTCAACACTCAAGACCAGAGCTTATTGGACTTTGGGAAACTCCTG 1401 ***** * ********** ******** ** * ** * * g815.t1_mn12262_mat2 GAAGCCAACGCCACCGGTGAAATCATAGACAGCATT---GCCACCACCTT 1685 g9066.t1_mm99049_mat2 GCAGTCAACGCCACCAGTGAATTCACAGACCGCATT---GCCGCCATCCT 1610 g10270.t1_mn11037_mat2 GAAGGGAACACCCCCAATGAACCCACAGACAGCCTTCCCACCACCACCTT 1451 * ** *** ** ** **** ** **** ** ** ** *** * *

Mn_MAT2_3871R: CATGGGCAGCGACTCCA MAT2_727R: CGACA g815.t1_mn12262_mat2 TCTCGTGGGCACCGATACCACAAACATATTCCCTGACAGCAACGCCACCA 1735 g9066.t1_mm99049_mat2 CCTCATGGGCAGCGACTCCACAAGCATATTTCCCAACAGCAACGCCACCA 1660 g10270.t1_mn11037_mat2 TGTCATGGGCAGCGACGCTACAAGCATGTTTTCCGACA------1489 ** ****** *** * **** *** ** * ***

Mn_MAT2_727R: ACAGCACGACAG g815.t1_mn12262_mat2 TCATAATTCCCGACGATAGCACGACCGACATGTTCCCGCCTGTGGACCTG 1785 g9066.t1_mm99049_mat2 GCACAATTCCCGAAAACAACACGGTTGCCATGTTGCCGCCTGTGGACCTG 1710 g10270.t1_mn11037_mat2 ------ACAGCACGACAGCTATGCCCCTGCCTATCGACCTG 1524 * * **** * *** * **** * ****** g815.t1_mn12262_mat2 CCATCTAGTGGCCCAGCCGACACGTCTTCGCCGACAGTGGACGTTGATGA 1835 g9066.t1_mm99049_mat2 GCATCTAGCGATCCAGCCGGCACGTCTTCGGCGACATTGGACATTGATGA 1760 g10270.t1_mn11037_mat2 GCATCTGGCGATGCAGTCGACATAACTTCG--GGT-TCGGCCTTTGATGA 1571 ***** * * *** ** ** ***** * ** * ******* g815.t1_mn12262_mat2 GTTCATCAATACAGACATGTTCCAGGATCACCTCTCTACTAGCTTGTCTG 1885 g9066.t1_mm99049_mat2 GTTCATCAACACAGAAATGTTTCAGGACCACCCCTCCACCAGCTTGTCTG 1810 g10270.t1_mn11037_mat2 GTTCATCAACACCGAAATGTTTCAAGACCACCCCTCTATCAACTTTTTGA 1621 ********* ** ** ***** ** ** **** *** * * *** * g815.t1_mn12262_mat2 TGGATTTCGCGGCCGTCGACGAGGAGTCGGAGTTTGACTTTGACTTTGCA 1935 g9066.t1_mm99049_mat2 TGGATTTCTCCGCCGCCAACGAGGAA------TTTGATTTTGACGTTGCA 1854 g10270.t1_mn11037_mat2 TGGACTCCTTTGTCAATGGCGATGAG------TTCAACCTTGACTTTAAA 1665 **** * * * * *** ** ** * ***** ** *


g815.t1_mn12262_mat2 AACTAC 1941 g9066.t1_mm99049_mat2 AACTAC 1860 g10270.t1_mn11037_mat2 AACTAC 1671 ******
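Some of the primers marked in this alignment (for example Mic_MAT2_198F, CGAAGCGYGAGRCGAAG) contain IUPAC ambiguity codes (Y = C or T; R = A or G), so a single primer sequence can anneal to more than one template variant. The following Python sketch is an illustration only and is not part of the original analysis. It expands a degenerate primer into a regular expression and locates it in an ungapped template. The function names `primer_regex` and `find_primer` are hypothetical, and the two template fragments are copied from the alignment block that carries the Mic_MAT2_198F annotation.

```python
import re

# IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_regex(primer):
    """Compile a (possibly degenerate) primer into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def find_primer(primer, template):
    """Return (0-based offset, matched bases) of the first exact site, or None."""
    match = primer_regex(primer).search(template.upper())
    return None if match is None else (match.start(), match.group())


# Template fragments taken from the MAT1-2-1 alignment (gaps absent here).
mn12262 = "AGCCCCGAAGCGTGAGACGAAGAAGCTAGCTCGGAAACCTCGCCAACCCG"
mn11037 = "TGCCCCGAAGCGCGAGGCGAAGAAGCCAGCTCGGAAACCTCGCGAACCCA"

for name, fragment in (("mn12262", mn12262), ("mn11037", mn11037)):
    print(name, find_primer("CGAAGCGYGAGRCGAAG", fragment))
```

With these inputs the primer matches both fragments at the same offset, resolving Y and R to T and A in isolate 12262 and to C and G in isolate 11037. The sketch checks exact matches on the forward strand only; reverse primers would first need to be reverse-complemented, and mismatches are not tolerated.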


Appendix 5.8 Alignment of M. majus 99049 MAT-region coding sequences with those of M. nivale 11037, with primer loci indicated for APN2

g816.t1_mn12262_apn2 ATGGGCATTCGCATCACTTCGTGGAATGTAAACGGCATTCGCAATCCATT 50 g9067.t1_mm99049_apn2 ATGGGCATTCGCATCACTTCGTGGAATGTCAACGGCATTCGCAATCCGTT 50 g10269.t1_mn11037_apn2 ATGGGCATTCGCATCACTTCGTGGAATGTCAACGGCATTCGCAATCCGTT 50 ***************************** ***************** **

g816.t1_mn12262_apn2 TGGGTACCAGCCATGGCGGGAGACACGGAGTTTCGAGGTGAGTGTGCTCT 100 g9067.t1_mm99049_apn2 TGGGTACCAGCCATGGCGGGAGACACGGAGTTTCGAG------87 g10269.t1_mn11037_apn2 CGGGTACCAGCCATGGCGGGAGACGCGGAGTTTCGAG------87 *********************** ************

g816.t1_mn12262_apn2 GCTTTGCCTGTCAGGACCTGTCGAAACAGTTGGCTCACCCTGT-TGTCCA 149 g9067.t1_mm99049_apn2 ------TTGGCTCACCCTGC-CGTTCA 107 g10269.t1_mn11037_apn2 ------CTCACCCCGGGCAACCA 104 ******* * **

g816.t1_mn12262_apn2 GTCCATGTTCGACATTCTCGAAGCAGACATTGTCGTCATGCAGGAGCTCA 199 g9067.t1_mm99049_apn2 GTCCATGTTCGATATTCTCGAGGCAGACATTGTCGTCATGCAAGAGCTCA 157 g10269.t1_mn11037_apn2 GTCCATGTTCGACATCCTCGAGGCAGACATTGTCGTCATGCAAGAGCTCA 154 ************ ** ***** ******************** *******

g816.t1_mn12262_apn2 AGATCCAACGCAAGGATCTTCGAGATGACATGGTCCTCGTGCCCGGATGG 249 g9067.t1_mm99049_apn2 AGATACAACGCAAGGATCTCCGAGACGACATGGTCCTCGTGCCCGGATGG 207 g10269.t1_mn11037_apn2 AGATCCAACGCAAGGATCTTCGGGATGACATGGTCCTCGTGCCCGGATGG 204 **** ************** ** ** ************************

g816.t1_mn12262_apn2 GACGTATACTTCAGCTTGCCCAAGCACAAGAAAGGCTATTCGGGTGTAGC 299 g9067.t1_mm99049_apn2 GACGTGTATTTCAGCTTGCCCAAGCATAAGAAAGGCTATTCGGGTGTGGC 257 g10269.t1_mn11037_apn2 GACGTGTACTTCAGCTTGCCCAAGCACAAGAAAGGCTATTCGGGTGTGGC 254 ***** ** ***************** ******************** **

g816.t1_mn12262_apn2 CATCTACACAAGAAACTCTGTCTGCTCACCCATCCGAGCAGAAGAAGGCA 349 g9067.t1_mm99049_apn2 TATCTACACAAGAAACTCCGTCTGCTCCCCCATCCGAGCAGAAGAAGGCA 307 g10269.t1_mn11037_apn2 TATTTACACAAGAAACTCCGTCTGCTCCCCCATCCGAGCAGAAGAAGGCA 304 ** ************** ******** **********************

g816.t1_mn12262_apn2 TCACCGGCTGCCTGACTTCCCCAGGCAGCACCACTGCCTACCGAGATCTC 399 g9067.t1_mm99049_apn2 TCACCGGCTGCCTGACTGCCCCAGGTAGCACCACTGCCTACCGAGACCTG 357 g10269.t1_mn11037_apn2 TCACCGGCTGCCTGGCACCTCCGGGCAGCACCATTGCCTACCGAGATCTT 354 ************** * * ** ** ******* ************ **

g816.t1_mn12262_apn2 CCAGAAGACCAGCAGATTGGTGGCTATCCTCGGCCCGGGCAGCTTCCTGG 449 g9067.t1_mm99049_apn2 CCAGAAGATCAGCAGATTGGTGGCTATCCTCAGCCTGGGCAGCTTCCTGG 407 g10269.t1_mn11037_apn2 CCAGAAGACCAGCAGATTGGTGGCTACCCTCAGCCTTGGCAGCTTCCTGC 404 ******** ***************** **** *** ************

g816.t1_mn12262_apn2 AGACGTCGACGATGTCGTTCTCGACAGTGAGGGTCGTTGCGTCATTCTCG 499 g9067.t1_mm99049_apn2 AGATGTCGATGATGTCGTTCTCGACAGTGAGGGTCGTTGCGTCATTCTCG 457 g10269.t1_mn11037_apn2 AGACGTCGAGGACGTCGTTCTCGACAGCGAAGGTCGTTGCGTCATTCTCG 454 *** ***** ** ************** ** *******************

g816.t1_mn12262_apn2 AATTCCCTGCATTTGTCATCATTGGCACCTACAGCCCAGCAAACAGCGAC 549 g9067.t1_mm99049_apn2 AGTTCCCTGCGTTTGTCATCATCGGCACCTACAGCCCAGCAAACAGCGAC 507 g10269.t1_mn11037_apn2 AGTTCCCTGCTTTTGTCCTAATAGGCACCTACAGTCCAGCAAATAGCGAC 504 * ******** ****** * ** *********** ******** ******

g816.t1_mn12262_apn2 GGCTCGCGGGACGACTTCAGAATCGGCTACATGGGGGCACTGGACGTGCG 599 g9067.t1_mm99049_apn2 GGCACGCGAGACGACTTCAAAATCGGCTACCTGGGGGCACTGGACGTGCG 557 g10269.t1_mn11037_apn2 GGCACGCGGGATGACTTTAAAGTCGGCTACCAAGGGGCCTTGGACATGCG 554 *** **** ** ***** * * ******** ***** ***** ****

g816.t1_mn12262_apn2 GATACGCAACCTCACCGCAATGGGGAAGCAGGTCATCCTGACAGGCGATC 649 g9067.t1_mm99049_apn2 AATACGCAACCTCACTGCAATGGGGAAGCAGGTCGTCCTGACAGGCGACC 607 g10269.t1_mn11037_apn2 GATACGCAACCTTGCCGCGATGGGAAAGCAGGTTGTACTGACTGGCGATC 604

*********** * ** ***** ******** * ***** ***** * g816.t1_mn12262_apn2 TCAACGTGATACGAGACGTGATTGACGCTGCTGGCCTGCATGACAGACTC 699 g9067.t1_mm99049_apn2 TCAACGTGATACGAGACGTGATTGATACTGCAGGCCTGCACGAGAGACTC 657 g10269.t1_mn11037_apn2 TCAATGTGATACGAGATGTGATTGACACTGCTGGCCTGCACGAGAGACTC 654 **** *********** ******** **** ******** ** ****** g816.t1_mn12262_apn2 AGAAAAGAAGGCATGACCATAGACGACTACTTTACCATGCCGTCGCAGCG 749 g9067.t1_mm99049_apn2 CGAAAAGAAGGCATGACCATGGACGACTACTTTACCATGCCGTCGCGGCG 707 g10269.t1_mn11037_apn2 AGGAAAGAAGGCATGACCATGGACGACTACTTCACCATGCCTTCGCGGCG 704 * ***************** *********** ******** **** *** g816.t1_mn12262_apn2 CGTCTTCACCCAGCTGGTCATTGGGGCCAAGGTCAAAGGCGGCCGGGACG 799 g9067.t1_mm99049_apn2 TGTTTTCACCCAGCTGGTCATTGGGGCCAAGGTCAAAGGCGGCCGGGACG 757 g10269.t1_mn11037_apn2 CGTCTTCACCCAGCTGGTCATCGGGGCCAAGGTCAGAGGCGGCCGGGACG 754 ** ***************** ************* ************** g816.t1_mn12262_apn2 AAGGTCGAGAAAAGCCGATCTTGTGGGACCTCGGGCGGCTCTTCCATCCT 849 g9067.t1_mm99049_apn2 AGGGTCGAGAAAAACCGGTCTTGTGGGACCTTGGGCGGCTCTTTCATCCT 807 g10269.t1_mn11037_apn2 AAGGTCGAGAAAAGCCAGTCTTGTGGGACCTGGGGCGGCTGTTCCATCCC 804 * *********** ** ************* ******** ** ***** g816.t1_mn12262_apn2 GATCGCCAGGGCATGTACACATGTTGGAACACCAAGACAAACTCGAGACC 899 g9067.t1_mm99049_apn2 GATCGCCAGGGCATGTATACATGTTGGGACACCAAGAAGAACTCGAGACC 857 g10269.t1_mn11037_apn2 GACCGCCAGGGCATGTACACATGCTGGGACACCAAGAAGAACTCGAGACC 854 ** ************** ***** *** ********* *********** g816.t1_mn12262_apn2 GGGAAACTACGGCAGTCGCATTGACTTTGTCCTATGCAGTAATGGCATAA 949 g9067.t1_mm99049_apn2 GGGAAACTATGGGAGTCGCATTGACTATGTCCTATGCAGCAATGGCATGA 907 g10269.t1_mn11037_apn2 AGGAAACTTTGGTAGCCGTATTGACTATGTCTTGTGCACCAACGGCATGA 904 ******* ** ** ** ******* **** * **** ** ***** * g816.t1_mn12262_apn2 AAGACTGGTTCGCGACCTCCAACATCCAGGAAGGACTGATGGGTTCAGAC 999 g9067.t1_mm99049_apn2 AAGACTGGTTCGCGACCTCTGACATCCAGGAAGGACTGATGGGTTCAGAC 957 g10269.t1_mn11037_apn2 AAGACTGGTTCACGGCCTCCGACATCCAGGAAGGCTTAATGGGCTCGGAC 954 *********** ** **** ************* * 
***** ** *** g816.t1_mn12262_apn2 CATTGCCCAGTATTTGCTGTCATGAGCGACATTGTCTCACTAGGTGGTAA 1049 g9067.t1_mm99049_apn2 CATTGCCCAGTATTTGCTGTAATGAACGACATTGTCTCACTAGGTGGCAA 1007 g10269.t1_mn11037_apn2 CACTGTCCAGTATTTGCTGTCATGAAAGACGTCATCTCACTAGATGGCAA 1004 ** ** ************** **** *** * ********* *** ** g816.t1_mn12262_apn2 AGACACTCACCTGCGCGATATCATGAACCCTGCTGGCACCTTTGAGAATG 1099 g9067.t1_mm99049_apn2 AGACACTCACCTGCGCGACATCATGAACCCCGCTGGTACCTTTGAGAATG 1057 g10269.t1_mn11037_apn2 AGATATCCACCTGCGAGACATTATGAACCCGATTGGCACATTTGAGAATG 1054 *** * ******** ** ** ******** *** ** **********

Mic_APN2_32R: GAACCTTCTCCCACTATCAGC Mn_lyase_838F CGCCCAAGAACCTTCTCC g816.t1_mn12262_apn2 GCCAGAGACTACAGGAATGGTCACCCAAGAACCTTCTCCCACTATCAGCA 1149 g9067.t1_mm99049_apn2 GTCAGAGAATACAAGAGTGGTCGCCCAAGAACCTTCTCCCACTATCAGCA 1107 g10269.t1_mn11037_apn2 GCCAGAGACTACAGGAGTGGTCGCCCAAGAACCTTCTCCCACTATCAGCC 1104 * ****** **** ** ***** ************************** g816.t1_mn12262_apn2 AAGCTGATCGCAGAGTTCGACCGCCGGCAGAGCATCAAAGCCATGTTCTT 1199 g9067.t1_mm99049_apn2 AAGCTGATCGCAGAGTTCGACCGCCGGCAGAGCATAAAAGACATGTTCTT 1157 g10269.t1_mn11037_apn2 AAACTGATTGCAGAGTTCGACCGTCGGCAGAACATCAAGGACATGTTCTT 1154 ** ***** ************** ******* *** ** * ********* g816.t1_mn12262_apn2 CAAAAAGCCTGCGTCGTCTACCTCAAGCCAGACTGTGCCTTTGACGAATG 1249 g9067.t1_mm99049_apn2 CAAAAAGCCAGCGTCGTCCACATCAAGCCAGGCTGTACCTTTAACGAATG 1207 g10269.t1_mn11037_apn2 CAAAAAGCCAACGTCGCTGACCTCAAGCCAATTTGCGCCTCTGACAAATA 1204 ********* ***** ** ******** ** *** * ** *** g816.t1_mn12262_apn2 GTCCGGCGGGTGCTGACGCGCCCACAACCATTCCGTCCACACCACAACAA 1299 g9067.t1_mm99049_apn2 GCTTGGCCGGNGC---CGTGCCCACAGCCACTCCGCCTACATCACAACAA 1254 g10269.t1_mn11037_apn2 GTTTGGAGGTGGCTGCCATCCCCACACTCTCAGCGACCGCACCAGGACAA 1254 * ** * ** * ****** * ** * ** ** ****


g816.t1_mn12262_apn2 ATCACCGATACCATGCCGTGTTTTACCCACACGGAATCATCAGCAGCTGC 1349 g9067.t1_mm99049_apn2 ATCACTGATACCACGCCGTGTCTTAGCCACGTGGAATCACCAGCAGCTGC 1304 g10269.t1_mn11037_apn2 TTTGCTGGTAGCACGCCGGGTCTTGCTCACGTGGAATCACCCACGGCTTC 1304 * * * ** ** **** ** ** *** ******* * * *** * g816.t1_mn12262_apn2 CCCGAGGTCATCAAACGGAGAACTCAAACGCACACCCTCAGGCTCGGTGA 1399 g9067.t1_mm99049_apn2 TCCAACTTCATCAAGTGGAGGACTCAAACGCACACCATCCGGCTCGGTGA 1354 g10269.t1_mn11037_apn2 CCCAAGGCCATCAAATGGAGGGCTCAAACGCACGCCGTCTGGCTCGGCGG 1354 ** * ****** **** *********** ** ** ******* * g816.t1_mn12262_apn2 CTCCACAGAGGCCAGCCAAAAAGACCAAAGCAGCCTTGCCAAGAGAATCA 1449 g9067.t1_mm99049_apn2 CTCCACAGAGACCACCGAAAAAGAGCAAAGCAGCCTTGCCGAAAGAATCA 1404 g10269.t1_mn11037_apn2 CTCTACAGAAGCCAGCCAAGAAGACCAAAGCAGCGTTACCCCAAGAAACA 1404 *** ***** *** * ** **** ********* ** ** **** ** g816.t1_mn12262_apn2 TCAAGTAAGGGGGGCCAGAAGAGCCTGATGGGATTCTTCAAGCCCAAGGC 1499 g9067.t1_mm99049_apn2 TCAAGCAAGGGGGGGCAGAAGAGCCTGATGGGATTCTTCAAGCCCAAGGC 1454 g10269.t1_mn11037_apn2 GCAAGTAAAGGGGGGCAGAAGAGCCTGATGGGATTCTTCAAGCCCAAGAT 1454 **** ** ***** ********************************* g816.t1_mn12262_apn2 TACAGCGACCTCATCGGCTGCTACTGCGCCCACCAATCTGCAATCAGAAG 1549 g9067.t1_mm99049_apn2 CACAGCGACCTCATCAACTGCCACCGCGCTTACCAATATGCACTCAGAAG 1504 g10269.t1_mn11037_apn2 AACAACAACATCAGCGGCTGCCACCACGCCCACATTTGTGCAATCAGAAA 1504 *** * ** *** * **** ** *** ** * **** ****** g816.t1_mn12262_apn2 AGACAACGGCGAGCCCGCCGTCGAGCTCTGCTTTGTCAAATCTCGATGAG 1599 g9067.t1_mm99049_apn2 AGACAGTGGCGAGCCCGCCGTCGAGCTCTGCTTTGTCAACGCTTGATGAG 1554 g10269.t1_mn11037_apn2 ACACAGCGGCGAGCCCGCCGCCGAGTTCTGCCCTGTCGGAGCTTGACGAG 1554 * *** ************* **** ***** **** ** ** *** g816.t1_mn12262_apn2 CACCACCAACA---ACAACAAGTTTCGCCAAAACAAGACCCGCCTGTCCC 1646 g9067.t1_mm99049_apn2 CACTACCAACAGCAACAACAAGTTTCACCAACACAAGACCCGCCTTTCCC 1604 g10269.t1_mn11037_apn2 CACCAAAAGCA---GCAACAAGTCTCGCCGAAACAGGACGCGCCTGTCCC 1601 *** * * ** ******** ** ** * *** *** ***** **** g816.t1_mn12262_apn2 
TCCAAGCCCTGAAAGGGTCATCGATCAGGTGCAGTCCCGCGAGACATGGT 1696 g9067.t1_mm99049_apn2 TCCCAGCCCTGAAAAGGTCATCGATCAGGTGCAGTCCCGCGAGACATGGT 1654 g10269.t1_mn11037_apn2 TCCCAGCCCCGAAAGGATCATTGACCAGGTGCAGTCCCGCGAGACGTGGT 1651 *** ***** **** * **** ** ******************** **** g816.t1_mn12262_apn2 CGAAGCTCCTGGGCAAGAGGGTCGTGCCCAGGTGCGAGCACGGCGAGGAC 1746 g9067.t1_mm99049_apn2 CGAAGCTCCTGGGCAAGAGGGTCGTGCCCAGGTGCGAGCACGGCGAGGAC 1704 g10269.t1_mn11037_apn2 CGAAACTTCTAGGCAAGAGGGTCGTGCCGAGATGCGAACACGGCGAGGAC 1701 **** ** ** ***************** ** ***** ************ g816.t1_mn12262_apn2 TGCATCAGCCTAGTCACCAAGAAGGCCGGCTTCAACAA------1784 g9067.t1_mm99049_apn2 TGTATTAGCCTGGTTACCAAGAAGGCCGGCTTCAACAA------1742 g10269.t1_mn11037_apn2 TGCATCAGTCTGGTTACCAAGAAGGCCGGCTTCAACAAGGGTAAGCAATA 1751 ** ** ** ** ** *********************** g816.t1_mn12262_apn2 ------g9067.t1_mm99049_apn2 ------g10269.t1_mn11037_apn2 TTTCCTAGTACCCACAGGACCTAGTCGCCTCATGAAGAATGAGGCTGGTG 1801

g816.t1_mn12262_apn2 ------g9067.t1_mm99049_apn2 ------g10269.t1_mn11037_apn2 CGATTTCAATGCAAACAATTGAGGCTGACGTATGCTCGACATCTGAAGGA 1851

g816.t1_mn12262_apn2 ------AGGCCCTTCCGGTGAGAAGGA 1805 g9067.t1_mm99049_apn2 ------AGGCCCTTCGGGTGAGAAGGA 1763 g10269.t1_mn11037_apn2 CGCTCGTTCTTCATCTGCCCACGACCTATAGGCCCTTCGGGCGAAAAGGA 1901 ********* ** ** ***** g816.t1_mn12262_apn2 AAAAGGGACAGAGTGGCGCTGCGGGACGTTTATATGGAGCAGTGACTGGA 1855 g9067.t1_mm99049_apn2 AAAGGGAACAGAGTGGCGCTGCGGGACGTTTATATGGAGCAGCGACTGGA 1813 g10269.t1_mn11037_apn2 GAAAGGCACAGAGTGGCGCTGTGGGACATTTATATGGAGCAGTGACTGGA 1951 ** ** ************** ***** ************** *******

g816.t1_mn12262_apn2 CTAGCAAGAGC 1866 g9067.t1_mm99049_apn2 CTAGCAAGAGC 1824 g10269.t1_mn11037_apn2 CTAGCAAAAGC 1962 ******* ***
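The asterisk line beneath each alignment block marks the columns in which all three sequences carry the same base. As a minimal sketch of that convention (an illustration, not the software used to produce these alignments), the following Python function rebuilds the conservation line for one block. The name `consensus_line` is hypothetical, and the input is the final block of the APN2 alignment.

```python
def consensus_line(seqs):
    """Return '*' for each column where every sequence has the same base.

    Columns containing a gap ('-') or any disagreement are left blank.
    """
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    for column in zip(*seqs):
        identical = len(set(column)) == 1 and column[0] != "-"
        marks.append("*" if identical else " ")
    return "".join(marks)


# Final block of the APN2 alignment.
block = [
    "CTAGCAAGAGC",  # g816.t1_mn12262_apn2
    "CTAGCAAGAGC",  # g9067.t1_mm99049_apn2
    "CTAGCAAAAGC",  # g10269.t1_mn11037_apn2
]

print(block[0])
print(block[1])
print(block[2])
print(consensus_line(block))
```

For this block the single unmarked column is the eighth, where isolate 11037 has A in place of G. Counting the asterisks and dividing by the block length gives a simple three-way identity figure for any region of interest.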


Appendix 5.9 Observed ascospore dimensions

Isolate   Length (µm)   Width (µm)   Q*    Number of Septa
99049     16.8          4.8          3.5   3
99049     24.0          4.8          5.0   3
99049     14.4          4.8          3.0   1
99049     14.4          3.6          4.0   1
99049     14.4          4.8          3.0   3
99049     21.6          4.8          4.5   3
99049     19.2          4.8          4.0   3
99049     16.8          4.8          3.5   3
99049     16.8          4.8          3.5   3
99049     14.4          4.8          3.0   2
12045     19.2          6.0          3.2   3
12045     19.2          4.8          4.0   3
12045     16.8          4.8          3.5   3
12045     12.0          4.8          2.5   1
12045     19.2          6.0          3.2   3
12045     16.8          4.8          3.5   3
12045     16.8          3.6          4.7   3
12045     19.2          2.4          8.0   3
12045     16.8          3.6          4.7   3
12045     19.2          3.6          5.3   3
12166     14.4          3.6          4.0   3
12166     12.0          3.6          3.3   1
12166     14.4          4.8          3.0   3
12166     16.8          4.8          3.5   3
12166     16.8          4.8          3.5   1
12166     16.8          4.8          3.5   1
12166     12.0          4.8          2.5   1
12166     14.4          4.8          3.0   2
12166     16.8          4.8          3.5   3
12166     14.4          4.8          3.0   1

* Q = length / width
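The Q values in the table follow directly from the footnote definition (Q = length / width), reported to one decimal place. As a small illustrative sketch, the following Python snippet recomputes Q for a few of the tabulated measurements. The function name `q_ratio` is hypothetical, and the one-decimal rounding is an assumption inferred from the values shown.

```python
def q_ratio(length_um, width_um):
    """Length-to-width ratio (Q) of a spore, rounded to one decimal place."""
    if width_um <= 0:
        raise ValueError("width must be positive")
    return round(length_um / width_um, 1)


# (length, width, reported Q) triples copied from the table above.
rows = [
    (16.8, 4.8, 3.5),
    (24.0, 4.8, 5.0),
    (16.8, 3.6, 4.7),
    (19.2, 2.4, 8.0),
]

for length, width, reported in rows:
    print(length, width, q_ratio(length, width), reported)
```

Each recomputed value agrees with the reported Q for these rows, which is a quick way to check the table for transcription errors.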


Appendix 5.10 Supplemental Results: Test of published mating type primers and redesigned universal primers

The MAT1_3149F and MAT1_3644R primer set, designed to amplify MAT1-1-1, yielded a single band of the predicted size (300 bp) from two of the four isolates with which it was initially tested. When one of these bands (from M. nivale isolate 10083) was sequenced in the forward direction, the resulting sequence was queried against the GenBank database (BLASTx and BLASTn) to assess its similarity to other MAT1-1-1 sequences in the database. The top match in both cases was eukaryotic release factor 1, a protein that has not been described as having a role in sexual reproduction. The MAT2_3515F and MAT2_4618R primer set failed to yield a single band from the isolates with which it was tested.

Two sets of redesigned primers, MAT1_3132F and MAT1_3303R, and MAT2_1404F and MAT2_1811R, were also tested, but failed to yield bands of the predicted size from M. nivale or M. majus. A further set of mating-type primers, MAT1_86F and MAT1_891R, and MAT2_488F and MAT2_650R, was then designed and tested. The MAT1_86F and MAT1_891R primer set produced strong bands of the predicted size (805 bp), as well as larger bands of approximately 900 bp in two of the five isolates with which it was tested. After optimization, single bands of approximately 1,500 bp and 600 bp were produced for the M. majus isolate 10096 and the M. nivale isolate 10102, respectively. Both were sequenced in the forward direction, but the resulting sequences were of poor quality, suggesting that multiple products of approximately the same size had been amplified by the primers.

The MAT1-2-1 primers MAT2_488F and MAT2_650R produced a single band of approximately 600 bp in all five of the isolates with which they were tested. Although this band was larger than the predicted amplicon, it was sent for sequencing in the forward direction. The resulting sequence had no matches in the GenBank database using the BLASTn algorithm, and the BLASTx algorithm yielded only a weak set of matches. The closest match in the BLASTx search (minimum e-value 0.26 across 24% of the query) was a hypothetical protein from Magnaporthe grisea, which in turn shared high sequence similarity with a GATA transcription factor.


Appendix 5.11 Supplemental results: Identification of flanking genes in May 2011 99049 assembly

In tBLASTn searches against the original assembly of the M. majus genome, the putative match to SLA2 displayed a 68.9% identity and an e-value of 0.0 with the query. For MAT1-2-1, the putative match displayed a 31.5% identity and an e-value of 4e-06 with the query sequence from Stemphylium majusculum. For APN2, the putative match displayed a 53.3% identity and an e-value of 7e-119.

Primers were designed near the 3' end of the putative APN2 sequence and the 5' end of the putative SLA2 sequence, with the goal of amplifying the region spanning between these genes, which was hypothesized to include the putative MAT1-2-1 gene identified as described above. These SLA2 and APN2 primers, Mn_APN2_838F and Mn_SLA2_23R, were tested with four M. majus isolates, including the sequenced isolate 99049. Despite manipulating the PCR conditions as described, these primers failed to amplify a fragment of the predicted size (between 6 and 8 kb, based on the size of this region in other species).


Chapter 6 Infection Process of Microdochium majus and M. nivale

6.1 Introduction

6.1.1 The disease cycle

The interaction between a given fungal pathogen and its host involves a series of distinct events known as the disease cycle (Agrios 2005). These events (Figure 6.1) include the dissemination of the infectious material (inoculum) to the potential host plant, the attachment to and penetration of the plant tissue, the infection and invasion of the host plant cells, and the growth of the pathogen that ultimately produces a fresh source of inoculum for subsequent infections. Depending on the identity of the pathogen, the inoculum may consist of hyphal fragments, conidia, sexual spores, or resting structures (e.g. sclerotia) (Agrios 2005), and the inoculum may be transferred to the host plant by diverse mechanisms including wind, water, insects, animals, or agricultural practices.

The series of physical processes that occur during the invasion of a plant by a pathogen is generally referred to as the infection process for a particular pathosystem. To invade the host plant, the pathogen may take advantage of existing openings, such as wounds created by insects or by human activity, or natural openings such as stomata. Alternatively, the fungus may directly penetrate into cells. Although it is possible for actively growing hyphae to infiltrate the plant tissue by simply growing through an existing opening, in many fungi, the infection process begins when conidia come into contact with the surface of a susceptible host plant. In some pathogens, following germination of the conidium, the tip of the emergent germ tube may differentiate into a specialized structure called an appressorium (Howard 1996), which provides the attachment necessary to drive a penetration peg either directly through the plant cuticle or through an existing opening. When appressoria are not formed, penetration may be more heavily reliant on the enzymatic degradation of the plant tissue (Mendgen et al. 1996).

Once inside the plant, the fungus may infiltrate neighbouring cells, causing structural damage and robbing the host plant of resources. While some plant pathogens are biotrophs, requiring a living host (Agrios 2005), saprotrophs are capable of growing on dead tissue. Some pathogens exhibit a mixture of these behaviours by beginning their infection as biotrophs, but switching to a saprotrophic lifestyle after the surrounding plant cells die or are killed (Agrios 2005).

To absorb nutrients from their host plants, biotrophs may develop specialized structures called haustoria, which penetrate the plant cell wall, but not its membrane (Mendgen et al. 2000). The developing haustorium expands in shape, often developing "finger-like" projections which maximize the surface area between the fungus and the plant cell membrane for the purpose of nutrient acquisition from the plant cell (Heath and Skalamera 1997). Although several genera within the Xylariales include phytopathogenic and/or endophytic members (Brunner and Petrini 1992; Davis et al. 2003), few reports are available describing the infection processes of these species.

6.1.2 The infection processes of M. nivale and M. majus

While conidia are important in the dissemination of many plant diseases, including other diseases of graminoid plants (e.g. Colletotrichum graminicola, the causal agent of anthracnose diseases of grasses (Khan and Hsiang 2006)), the conidia of Microdochium nivale sensu lato are not reported to initiate disease symptoms (Pronczuk and Messyasz 1991; Tronsmo et al. 2001).

Outbreaks of this pathogen appear to be mediated by the spread of soil-borne hyphae, as conidial germination and subsequent germ tube formation appear to be a prohibitively slow and non-infective process (Pronczuk and Messyasz 1991). This unusual observation may be explained at least in part by the reduced levels of competition faced by M. nivale and M. majus at the cool temperatures at which they cause disease (Tronsmo et al. 2001). It may also be possible that the conidia are not directly infective, and may instead germinate in the soil; these species may then survive as saprotrophs, only behaving as pathogens under favourable environmental conditions (Tronsmo et al. 2001). Alternatively, conidia may play a role in sexual reproduction by transferring genetic information between strains, performing a spermatizing function (Webster and Weber 2007).

The infection process of M. nivale sensu lato has been studied on three turfgrass species (Dahl 1934). On all host species described, fungal hyphae penetrated through stomata, rather than directly punching through the foliar surface (Dahl 1934). Penetration occurred within 3 days post-inoculation when hyphal inoculum was used. A study of the infection process of M. nivale on the cereal triticale (× Triticosecale, a cross of wheat, Triticum vulgare, with rye, Secale cereale) was recently reported (Dubas et al. 2011). By using fluorescent dyes to track the growth of the fungus following hyphal inoculation, these researchers found that hyphae grew directly into the stomata of the plants, and formed haustorium-like structures within the plant cells.

Haustoria are a typical feature of biotrophic fungi (Catanzariti 2011), and have not been previously reported for M. nivale. However, both studies on triticale were performed with the same strain of M. nivale that was originally isolated from rye in 2001, and although the authors state that this particular strain was highly virulent, it is unclear whether it was truly M. nivale, or actually M. majus.

On both rye and triticale, researchers have reported the presence of haustoria-like structures in infected tissues (Dubas et al. 2011; Zur et al. 2011). The presence of haustoria implies that these pathogens are engaged in biotrophy, absorbing nutrition from the living host plant. Microdochium nivale sensu lato has previously been described as saprotrophic, as it is capable of surviving on dead plant material without a living host for up to one year (Tronsmo et al. 2001). It is possible that M. nivale and M. majus are hemibiotrophs, exhibiting a short period of biotrophy on the living host before the host tissue is killed by degradative enzymes (Agrios 2005), but this needs to be confirmed by more extensive microscopic and physiological work.

6.1.3 Host specificity and infection success

Microdochium nivale and M. majus are known to possess host preferences, in that M. majus has only been isolated from cereals whereas M. nivale colonizes both cereals and grasses (Hofgaard et al. 2006; Ioos et al. 2004; Mahuku et al. 1998; Simpson et al. 2000; Smith 1983). The ability of each species to preferentially colonize different hosts may be related to their differing abilities to overcome host defences, or to their ability to avoid inducing different levels or types of defensive responses in different hosts (Simpson et al. 2000). Similarly, it is possible that the physical processes of stomatal infiltration differ between M. nivale and M. majus, either in general or when in contact with different hosts.

6.1.4 Objectives

The purpose of this research was to determine whether the timing and mechanism of the infection processes of M. nivale and M. majus differ on different host plants, and to determine whether within-species variation might be observed for M. nivale isolates initially collected from either wheat (Triticum aestivum) or from a turfgrass species. The infectivity of conidia and hyphal fragments was also investigated. Finally, the claim that specialized structures such as haustoria are produced during infection was investigated on detached leaves of wheat and Kentucky bluegrass (Poa pratensis) by searching for the presence of these structures on inoculated leaves.

6.2 Materials and Methods

6.2.1 Plant culture

Seeds of Triticum aestivum (red hard winter wheat) or Poa pratensis (Kentucky bluegrass) cultivar “common #1” were sterilized and pre-germinated before planting. The seeds were rinsed in sterile distilled H2O (SDW) for 30 s, then in 70% EtOH for 15 s before being sterilized in 1% NaOCl and finally rinsed in SDW. Seeds of wheat were held in the sterilization solution for 1 min, while Kentucky bluegrass seeds were sterilized for 30 s. The seeds were then placed in Petri plates containing two sheets of autoclaved 7 cm-diameter filter paper (Whatman qualitative circles, grade 2; Fisher Scientific) moistened with SDW. After 48 h, germinated seeds were planted in 500 mL Mason jars (Bernardin, Richmond Hill, ON, Canada) containing 50 mL (300 g) of rooting mix autoclaved twice (20 min cycle at 121 °C) at 24 h intervals. The rooting mix contained 80% sand and 20% peat. Approximately 100 seeds of Kentucky bluegrass or 30 seeds of wheat were planted in each jar. The plants were maintained at 23 °C for three weeks under high humidity and under constant fluorescent lights emitting 50 µmol/m²/s, and were watered with SDW as necessary.

6.2.2 Inoculum preparation

To induce conidiation for spore inoculation, agar plugs were cut from the actively-growing margin of a colony, and one was placed in the centre of a fresh PDA plate. The plates were incubated at room temperature in the dark for 48 h before being exposed to UV light at room temperature. For isolates that readily produced conidia, conidiation was induced within 48 h of exposure to UV light. Some isolates failed to produce conidia, and hence different isolates were used in the different experiments (Table 6.1). Spore suspensions were prepared by pipetting 1.5 mL of SDW into each plate of an actively sporulating colony, gently scratching with a sterile glass rod, then filtering this suspension through two layers of autoclaved muslin into a 1.5 mL plastic tube. The spore concentration was determined using a haemocytometer, and the final concentration was adjusted to 10^6 spores/mL using SDW.
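The adjustment to a target spore concentration follows the usual C1 × V1 = C2 × V2 dilution relationship. The sketch below is a minimal illustration rather than the procedure recorded for these experiments: the function names are hypothetical, and the default of 1e-4 mL per counted square assumes an improved Neubauer chamber (1 mm × 1 mm squares, 0.1 mm deep), which the text does not specify.

```python
def spores_per_ml(square_counts, square_volume_ml=1e-4, dilution_factor=1.0):
    """Estimate spores/mL from spore counts in haemocytometer squares.

    square_volume_ml is an assumption (improved Neubauer large square);
    dilution_factor is how many-fold the sample was diluted before counting.
    """
    if not square_counts:
        raise ValueError("at least one square count is required")
    mean_count = sum(square_counts) / len(square_counts)
    return mean_count / square_volume_ml * dilution_factor


def sdw_to_add_ml(stock_conc, stock_volume_ml, target_conc=1e6):
    """Volume of SDW to add so the stock reaches target_conc (C1*V1 = C2*V2)."""
    if stock_conc < target_conc:
        raise ValueError("stock is already below the target concentration")
    final_volume_ml = stock_conc * stock_volume_ml / target_conc
    return final_volume_ml - stock_volume_ml


# Hypothetical counts from four squares of an undiluted suspension.
stock = spores_per_ml([240, 260, 255, 245])
extra_sdw = sdw_to_add_ml(stock, stock_volume_ml=1.0)
```

A suspension that is already more dilute than the target cannot be adjusted by adding SDW, so the sketch raises an error in that case instead of returning a negative volume.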

For experiment A, an auger was used to cut agar plugs 5 mm in diameter from a two-week-old colony of the isolate of interest growing on PDA. Hyphal inoculum was prepared by pouring 3 mL of SDW onto the surface of a non-sporulating two-week-old colony of the isolate of interest growing on PDA. The plates were scraped with a sterile glass rod, and the resulting hyphal suspension was diluted 1:10 with autoclaved water.

6.2.3 Inoculation

Five separate experiments, denoted as A-E, were performed. A general summary of the types of inoculum and the number of isolates included in each experiment is available in Table 6.1, and a list of isolates used in each experiment is provided in Table 6.2. Both host plant species were inoculated with every fungal isolate used in each experiment. All plates for each host plant by isolate combination were prepared in triplicate. Non-inoculated controls were prepared using either agar plugs from fresh (non-inoculated) PDA plates (experiments A and B) or by spraying the leaves with SDW (experiments C-E). To ensure that they were viable, the inocula used in these experiments were also applied to fresh PDA plates and the growth was monitored.


Leaf blades were collected from the wheat and Kentucky bluegrass grown as described (Section 6.2.1), and were cut into sections approximately 1 cm in length using a flame-sterilized scalpel. The leaf sections were placed in Petri plates containing two sheets of autoclaved 7 cm-diameter filter paper (Whatman qualitative circles, grade 2; Fisher Scientific) moistened with SDW. Approximately 20 sections were placed in each plate, and each plate received a single type of inoculum (Table 6.1). For hyphal inoculation in experiments A and B, a plug of PDA 3 mm in diameter cut from the actively-growing region of a colony was placed face-down onto each leaf segment. The plug was removed after 24 h. For experiments C-E, leaf blade sections were inoculated with approximately 200 µL of the hyphal or conidial suspension using a hand pump sprayer.

Immediately following inoculation, the plates containing the inoculated leaf sections were placed in a plastic bin and were incubated under the conditions described in Table 6.1. For the "full light" condition, plates were placed in a transparent plastic box and incubated in the presence of a fluorescent light emitting 50 µmol/m²/s for 24 h each day. For the "darkness" condition, plates were placed in an opaque plastic box. To maintain high humidity, paper towels saturated with sterile water were placed along the bottom of the box and re-moistened as necessary.

6.2.4 Sample collection, staining, and scoring

At each collection time point (Table 6.3), one leaf segment was removed at random from each of the plates. All three segments from a single treatment group were placed in 1.0 mL of acetic alcohol (1:3 glacial acetic acid: 95% ethanol) for 24 h to remove chlorophyll from the leaves. After 24 h the acetic alcohol was carefully poured off and fresh acetic alcohol was added.


After a second 24 h period, the acetic alcohol was removed and the leaf sections were placed onto a glass slide containing two drops (approximately 100 μL) of 0.05% trypan blue (w/v) in lactophenol (20% phenol, 20% lactic acid, 40% glycerine, and 20% water). The leaf sections were stained for 48 h before being soaked in lactophenol for 12-24 h to remove excess stain.

Finally, the segments were mounted in two drops of lactophenol and were covered by a glass coverslip. A layer of transparent nail polish was painted around the exterior of the coverslip to prevent movement or drying of the mounted material. Due to the destructive nature of this sampling protocol, different leaf segments were examined at each time point.

The slides were examined at 100x and 400x magnification with a Nikon Labophot microscope, and representative photographs were taken using either a Nikon CoolPix800 digital camera (experiments A and B) or a Nikon D60 camera (experiments C-E). All leaf sections were assessed for both hyphal coverage and number of incidences of penetration.

In experiments A-C, hyphal coverage and incidence of penetration were each assessed on a binary scale (i.e. a score of "0" was assigned if the trait was absent, and a score of "1" if it was present). In experiments D and E, the total area of each leaf segment was calculated by measuring its length and width at 100x magnification using an eyepiece micrometer. The hyphal coverage was expressed as the percentage of leaf tissue obscured by hyphae. For the incidence of penetration, the total number of penetration events on each leaf was counted and divided by the total area of the leaf segment for intra-experimental comparisons.
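The two quantitative scores used in experiments D and E can be written out as a short sketch. The function names, the millimetre units, and the treatment of each leaf segment as a rectangle (area = length × width) are assumptions made for illustration; the text states only that length and width were measured.

```python
def segment_area_mm2(length_mm, width_mm):
    """Area of a leaf segment, approximated as a rectangle (an assumption)."""
    if length_mm <= 0 or width_mm <= 0:
        raise ValueError("length and width must be positive")
    return length_mm * width_mm


def hyphal_coverage_percent(covered_area_mm2, length_mm, width_mm):
    """Percentage of the segment surface obscured by hyphae."""
    return 100.0 * covered_area_mm2 / segment_area_mm2(length_mm, width_mm)


def penetration_density(n_events, length_mm, width_mm):
    """Penetration events per mm^2, so segments of different size are comparable."""
    return n_events / segment_area_mm2(length_mm, width_mm)


def binary_score(trait_present):
    """Presence/absence score of the kind used in experiments A-C."""
    return 1 if trait_present else 0


# Hypothetical segment: 10 mm x 3 mm, 7.5 mm^2 obscured, 12 penetration events.
coverage = hyphal_coverage_percent(7.5, 10.0, 3.0)
density = penetration_density(12, 10.0, 3.0)
```

Dividing the count by the segment area matters because the segments were only approximately 1 cm long, so raw counts from larger segments would otherwise be inflated.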

6.2.5 Statistical methods

Statistical analyses were conducted using the results from experiments D and E. For all statistical tests, a type I error threshold of 0.05 was used. All statistical analyses were completed using SAS v. 9.1 (SAS Institute Inc., Cary, NC, USA). Data from each experiment were analysed separately. A non-inoculated "blank," treated with water, was included at each time point in both experiments. There were three replicates for each treatment condition. The fixed effects were the strain of inoculum applied, the identity of the host plant, and the date of collection.

The data were sorted by date of collection and the host plant of origin, then assessed for normality using the Shapiro-Wilk test (H0: the samples come from a normally distributed population), and were tested for homogeneous variance using the Levene test (H0: the variances of the data are equal). For data sets that were not normally distributed or that did not exhibit homogeneous variance, the non-parametric Wilcoxon rank-sum test (also known as the Mann-Whitney U test; H0: the samples come from identical populations) was used to compare the means within each time point.
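To make the rank-sum comparison concrete, the sketch below implements a two-sided Wilcoxon rank-sum test using only the Python standard library. It is not the SAS code used for these analyses: it uses the normal approximation with a tie correction and no continuity correction, so its p-values will not necessarily match SAS output, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns (W, z, p), where W is the rank sum of the first sample.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    combined = list(x) + list(y)
    n = n1 + n2
    ranks = average_ranks(combined)
    w = sum(ranks[:n1])
    mean_w = n1 * (n + 1) / 2
    # Tie correction: sum of (t^3 - t) over groups of t tied observations.
    counts = {}
    for value in combined:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_w == 0:
        raise ValueError("all observations are identical")
    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return w, z, p


# Hypothetical penetration densities for one isolate on the two hosts.
w, z, p = rank_sum_test([0.10, 0.20, 0.30], [0.40, 0.50, 0.60])
```

With only three replicates per group, as in these experiments, any large-sample approximation is rough, which is one reason the output of this sketch should be read as illustrative only.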

A general linear model by date of collection and identity of the host plant was utilized. The classification variables were the fungal inoculum's species, host-plant origin, and isolate number. The model was that the number of penetration incidences per unit area was dependent on the isolate number. The means and least significant difference (LSD) were calculated for each isolate. An example of the SAS statements used in these analyses may be found in Appendix 6.1.
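For reference, Fisher's least significant difference for comparing two isolate means in a balanced design is commonly written as below. This is the textbook form, offered as an assumption about what the LSD calculation computes rather than a statement of the exact SAS options used; the symbols MSE (error mean square of the linear model), df_E (error degrees of freedom), and n (replicates per treatment, three here) are introduced for illustration, with alpha = 0.05.

```latex
\mathrm{LSD} = t_{1-\alpha/2,\;\mathrm{df}_E}\,\sqrt{\frac{2\,\mathrm{MSE}}{n}},
\qquad
\left|\bar{y}_i - \bar{y}_j\right| > \mathrm{LSD}
\;\Rightarrow\; \text{means } i \text{ and } j \text{ differ at level } \alpha .
```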

6.3 Results

6.3.1 Experiment A

The goal of experiment A was to observe the infection of wheat and Kentucky bluegrass leaves by M. nivale and M. majus. In this experiment, hyphal plugs were the sole source of inoculum used, and two isolates of each species were used (Figure 6.2). Under these conditions, penetration of the leaf tissue was first observed at 2 days post-inoculation (dpi) on both host plants. On Kentucky bluegrass, both isolates of both M. nivale and M. majus penetrated the leaf tissue at 2 dpi, whereas on wheat, only the M. nivale isolates (both originally collected from grass) penetrated the leaf tissue at 2 dpi. Penetration of wheat leaves by M. majus was first observed at 3 dpi.

The mechanism of penetration was the same on both leaf types and for both M. nivale and M. majus: hyphae were observed growing directly into the stomata (Figure 6.3). When the leaves were heavily infected, hyphae were also observed growing out of the stomata (Figure 6.4).

Neither appressoria nor haustoria were observed on any leaf type or with either of the fungal species tested.

6.3.2 Experiment B

The goal of experiment B was to determine whether the timing and/or the mechanism of penetration differed when a conidial suspension, rather than a hyphal suspension, was used to apply inoculum to the leaf sections. Two isolates of M. majus and four isolates of M. nivale were included in this experiment; two of the M. nivale isolates were originally collected from wheat, and the other two were collected from turfgrass. In addition, a suspension containing hyphal tissue, rather than a plug of agar (as in experiment A), was used to apply the hyphal inoculum.

For leaf segments inoculated with a hyphal suspension (Figure 6.5), the trends were similar to those obtained in experiment A. On Kentucky bluegrass, five of the six isolates (all except one of the M. nivale isolates originally from turfgrass) penetrated the leaf tissue at 2 dpi. On the wheat leaves, only one of the M. majus and one of the M. nivale isolates from wheat had penetrated the leaf tissue by 2 dpi. At 3 dpi, both of the M. majus isolates, one of the two M. nivale wheat isolates, and both of the turf isolates were observed penetrating the Kentucky bluegrass leaf segments, and all six of the isolates had penetrated the wheat leaves.

Between experiments A and B, the application of hyphal inoculum by using a hyphal suspension rather than an agar plug yielded nearly identical results.

For leaf segments inoculated with a conidial suspension (Figure 6.6), penetration was delayed by one day relative to the leaves inoculated with hyphae. On Kentucky bluegrass, only the M. majus isolates were observed penetrating the leaf tissue at 3 dpi, whereas M. nivale was not observed penetrating the leaf tissue until 4 dpi. On wheat, the M. nivale isolate was never observed penetrating the leaf tissue. Only a single M. majus isolate had penetrated the leaf tissue at 3 dpi, and the incidence of penetration remained infrequent and inconsistent for both M. majus isolates until 6 dpi. As in experiment A, the sole mechanism of penetration observed was the direct penetration of stomata. Neither appressoria nor haustoria were observed.

6.3.3 Experiment C

Experiment C was similar to experiment B, but in this experiment, the plates were incubated in the dark, rather than under full light. Both hyphal and conidial suspensions were used to inoculate detached leaf segments of wheat and Kentucky bluegrass. On Kentucky bluegrass inoculated with a hyphal suspension (Figure 6.7), both of the M. majus isolates were observed penetrating the leaf tissue at 1 dpi, whereas penetration for M. nivale was not observed until 2 dpi. On wheat inoculated with a hyphal suspension (Figure 6.7), penetration was observed for only a single M. majus isolate at 1 dpi. The second M. majus isolate was observed penetrating the leaf tissue at 2 dpi, and the M. nivale isolate penetrated the tissue at 3 dpi.


When wheat and Kentucky bluegrass were inoculated with conidia from two isolates of M. nivale, both originally collected from turfgrass (Figure 6.8), penetration was not observed until 3 dpi for either leaf type. As in experiments A and B, the sole mechanism of penetration observed in experiment C was the direct penetration of stomata. Neither appressoria nor haustoria were observed.

6.3.4 Experiment D

In experiment D, the only inoculum type applied was a hyphal suspension. The inoculated leaf sections were incubated in the dark at 4 °C, rather than at room temperature as in previous experiments. In addition, the method by which the presence or absence of hyphal colonization and penetration was recorded was changed relative to the previous experiments (Section 6.2.4).

In this experiment, the number of incidences of penetration by unit area on the two host plants and for each day were compared between the isolates, according to their species and their host plant of origin. Two isolates of M. majus and four of M. nivale (two from wheat and two from turf) were included in this experiment.

On Kentucky bluegrass (Figure 6.9), penetration was first observed by one of the M. nivale isolates from turf at 3 dpi. All six of the isolates included in this study penetrated the leaf tissue by 6 dpi. On wheat (Figure 6.9), one M. majus isolate penetrated the leaf tissue at 4 dpi. All of the isolates, with the exception of one M. majus isolate and one M. nivale isolate from turf, had penetrated the leaf tissue by 6 dpi.

Most of the data for each isolate's incidence of penetration at each date and between the two host plants were found to be non-normally distributed (p-value for Shapiro-Wilk test < 0.05) and/or exhibited non-homogeneous variance (p-value for Levene test < 0.05). For this reason, the statistical significance of these data was assessed using the non-parametric Wilcoxon two-sample test, using the two-sided t approximation (Pr > |Z|) to identify significant differences in the mean number of penetration events between the two leaf types for each individual isolate (Table 6.4). Using this test, none of the differences were found to be statistically significant.

When the data from this experiment were pooled by the host origin of the fungal isolates, a different pattern emerged (Table 6.5). Because the data for the number of incidences of penetration on the two host plants were found to be non-normally distributed (p-value for Shapiro-Wilk test < 0.05) and/or exhibited non-homogeneous variance (p-value for Levene test < 0.05), once again the Wilcoxon test was employed to identify significant differences within a host origin group. When the data were partitioned in this manner, the mean incidences of penetration differed significantly between the turf and wheat isolates on both Kentucky bluegrass and wheat at 6 and 7 dpi: in both cases, a larger number of penetration occurrences were observed for the turf-derived isolates than for the wheat-derived isolates.

6.3.5 Experiment E

The purpose of experiment E was to repeat the previous room-temperature experiments using a hyphal suspension but with the updated scoring system (experiment D) to record the incidences of leaf colonization and penetration (Figure 6.10). On Kentucky bluegrass, penetration of the leaf tissue was observed at 2 dpi for leaf segments inoculated with one of the M. majus isolates, one M. nivale turf isolate, and one M. nivale wheat isolate. At 3 dpi, all isolates with the exception of one M. nivale wheat isolate were observed penetrating the leaf tissue. On wheat, one M. majus isolate and one M. nivale isolate from wheat were observed penetrating the leaf tissue at 2 dpi. At 3 dpi, all of the isolates with the exception of one of the M. nivale wheat isolates had penetrated the leaf tissue.

In experiment E, non-normality and/or non-homogeneous variance were found for both the by-isolate and the by-host-origin datasets, so the data were analysed as described for experiment D. In experiment E, none of the means in either analysis were found to differ significantly (Table 6.6 and Table 6.7).

6.4 Discussion

Several independent experiments were conducted to elucidate the physical mechanisms and the timing of the infection processes of M. nivale and M. majus on the representative host plants T. aestivum and P. pratensis. The influences of inoculum type (conidial vs. hyphal), incubation temperature (4 °C vs. 23 °C), and host plant of origin (wheat vs. turfgrass) were investigated by observing the rate of fungal colonization of the plant tissue and by counting the incidences of penetration into the plant tissue. All of the isolates used in these experiments displayed approximately the same rate of growth on PDA at room temperature, a trait that is correlated with virulence (Hofgaard et al. 2006).

Penetration of hyphae into the plant tissue was observed in every experiment conducted. Neither appressoria nor haustoria were observed in any experiment or on either host plant; instead, the hyphae were observed growing directly into the stomata without the formation of specialized structures (Figure 6.3). In addition, when the plant tissue was heavily colonized, hyphae were observed growing out of the stomata (Figure 6.4). These observations are in line with early descriptions of the infection process (Dahl 1934). However, two recent studies, both using the same single strain of M. nivale (originally collected from rye) to prepare all of the inoculum, examined the infection process of M. nivale on the hybrid cereal triticale and on rye (Secale cereale) and suggested that this pathogen forms haustoria or haustorium-like structures within the plant tissue (Dubas et al. 2011; Zur et al. 2011). The lack of haustoria in the experiments reported herein may be due to the use of detached senescing leaves rather than attached vigorous leaves, as were used in the experiments by Dubas et al. (2011) and Zur et al. (2011). Biotrophy may be an important strategy for these pathogens when the plant is actively growing, but when the leaves are inactive or dead, as in these experiments, saprotrophy may be employed instead.

One goal of the experiments described in this Chapter was to explore the claim that conidia alone do not cause infection on plant tissue. In both experiments that included conidial inoculum, the leaves inoculated with conidial suspensions displayed a much lower rate of colonization relative to leaves inoculated with hyphae and incubated under the same conditions. These same conidial suspensions were known to contain living spores capable of normal growth because aliquots of the spore suspensions used in all experiments were incubated on artificial media and were observed to germinate and to produce colonies identical to those established using actively-growing hyphae. Despite this broad similarity, the experiments including conidia revealed large differences between the different strains used. In experiment B, the conidia of the single M. nivale isolate studied were not observed to germinate and to subsequently infiltrate the wheat tissue within the timeframe of the experiment; however, in experiment C, both of the M. nivale isolates studied penetrated the leaf tissue of wheat by 4 dpi. While one of the two M. nivale isolates used in experiment C was observed penetrating the wheat tissue at 3 dpi, and the observation of penetration remained consistent for the remaining duration of the experiment, the other isolate studied was first observed penetrating the wheat at 4 dpi, and was not observed penetrating the leaf tissue consistently until 6 dpi. The M. majus isolates included in experiment B also yielded inconsistent results, and also failed to consistently penetrate the leaf tissue until 6 dpi.

On Kentucky bluegrass, the M. nivale isolate included in experiment B yielded inconsistent results. Penetration of the leaf tissue was first observed at 4 dpi, but was not observed again until 7 dpi. In contrast, both of the M. nivale isolates included in experiment C consistently penetrated the leaf tissue from 3 dpi through to the end of the observation period. The M. majus conidia used in experiment B consistently penetrated the tissue of Kentucky bluegrass every day after 3 dpi.

The inconsistent observation of penetration for most of the isolates included in these studies, in addition to the failure of conidia to produce rapid hyphal coverage on the surface of the plant leaves, suggests that the surface of the plant is a less favourable environment for spore germination than artificial media. Because detached leaf sections, rather than whole and / or living leaves were used in these experiments, it is unlikely that this observation can be explained by the activation of the plant's active defences. Instead, a more likely explanation is that the properties of the leaves' surfaces, such as the waxy cuticle, may have prevented the conidia from adhering long enough to germinate. This suggestion is corroborated by the fact that "stray" ungerminated conidia were not observed on the leaf surface at any time.

Together, the observations that conidia are capable of rapid germination and growth on artificial media, but apparently not on plant tissue, corroborate earlier claims that the conidia of M. nivale and M. majus do not cause plant disease directly. Instead, conidia may be important in the dispersal of these pathogens. After dispersal, the conidia could germinate in the thatch or on dead plant tissue and grow saprotrophically, producing a mycelial mass that may later cause infection of the plant when conditions become favourable for the fungus.

In all experiments, the hyphal inoculum of both M. nivale and M. majus, regardless of the host species of origin, colonized and penetrated the leaf tissue of both of the host species studied, P. pratensis and T. aestivum. Regardless of whether it was delivered via a plug of agar or by a hyphal suspension, hyphal inoculum resulted in the colonization and penetration of the leaf tissue within 48 hours post-inoculation in most cases. Penetration was observed by 72 hours post-inoculation in most cases when conidial inoculum was used. Together, these experiments suggest that both M. nivale and M. majus, regardless of the host plant from which they were originally isolated, are capable of penetrating the leaf tissue of both turfgrasses and cereals. The observed host preferences in the field may thus reflect environmental and biotic factors that might limit spread.

Despite the overall trends in the timing of infection as described above, some inconsistencies were observed within and between experiments. Although a hyphal suspension was used in both experiments C and E, and the leaf tissue was incubated at the same temperature in both experiments, the timing of penetration by M. majus isolate 99049 was delayed by 1 day on both leaf types in experiment C relative to experiment E. In contrast, for the M. nivale turf isolates, the penetration of Kentucky bluegrass was first observed at 2 dpi in both experiments E and C, but at 3 dpi in experiment C and at 2 dpi in experiment E. These conflicting observations suggest that, although there were no consistent differences between the observations for the isolates included in the experiments described, the trends observed for individual isolates were variable.

These differences underscore the necessity for these observations to be interpreted as general trends in the timing when penetration may be expected, and also the need for a greater number of replications so that stochastic variation can be taken into account.

Among the M. nivale isolates studied, the host species from which the isolate was originally collected did not exhibit a strong influence on the isolate's ability to penetrate the tissues of either of the host plants under study. This was unexpected, because genotypic differences have been identified between populations of M. nivale growing on different turfgrass species (Mahuku et al. 1998), and M. nivale isolates from wheat are genetically distinct from those originating from turf (see Chapters 2 and 3). Although M. majus has only been isolated from cereals, and has not been reported on turfgrasses (Maurin et al. 1995), this pathogen is clearly capable of colonizing and penetrating the tissue of at least one turfgrass species under artificial inoculation conditions. One explanation for the apparent lack of M. majus on grasses in the field may be that M. nivale or other phylloplane organisms may simply out-compete M. majus on turfgrasses, perhaps due to differences in their abilities to either trigger or resist plant defensive processes. For example, M. majus has been shown to out-compete M. nivale on rye (Secale cereale) (Hofgaard et al. 2006), and also displays a decreased sensitivity to the plant defensive compound benzoxazolinone, which is produced in large quantities by rye (Simpson et al. 2000). A more detailed analysis of the plant's responses to the presence of these pathogens (e.g. the identification and monitoring of transcript levels of defensive enzymes) may provide further insight into the relationships between these pathogens and their preferred hosts.

In addition to the experiments conducted at room temperature, the inoculated leaves in experiment D were incubated at 4 °C. A lower temperature was chosen to simulate the conditions in the field at which M. nivale and M. majus typically cause disease. At the cooler temperature, the timing of surface colonization and penetration was delayed, but not prevented. The trends observed, with respect to the slightly earlier penetration by pathogens on their native hosts, were unchanged. This delay in the penetration of plant tissues is in line with the slower growth rate of these pathogens on artificial media at these lower temperatures (Snider et al. 2000). Despite this reduced growth rate, the ability of these pathogens to remain active at low temperatures (i.e. under snow or between 0 and 15 °C) may allow them to exploit their host plants when other pathogens are inactive. For this reason, M. nivale and M. majus may cause disease more frequently at cooler than at warmer temperatures simply because they are out-competed by other pathogens in the field at warmer temperatures.

Taken together, the results presented in this chapter suggest that there are no obvious differences in the mechanisms by which M. nivale and M. majus attack their host plants. Both species were capable of colonizing and penetrating plant tissue from both their native and non-native hosts, and the temperature of incubation delayed but did not otherwise alter these patterns. Hyphal inoculum resulted in the rapid colonization of plant tissue, whereas conidial inoculum yielded inconsistent results, but generally resulted in a slower rate of tissue colonization.

6.5 References for Chapter 6

Agrios, G.N. 2005. Plant Pathology. Elsevier Academic Press, Burlington, MA.

Brunner, F., and Petrini, O. 1992. Taxonomy of some Xylaria species and xylariaceous endophytes by isozyme electrophoresis. Mycological Research 96(9): 723-733.

Catanzariti, A.-M., Mago, R., Ellis, J., and Dodds, P. 2011. Constructing Haustorium-Specific cDNA Libraries from Rust Fungi. In Plant Immunity: Methods and Protocols, Methods in Molecular Biology. Edited by J.M. McDowell.

Dahl, A.S. 1934. Snowmold of turf grasses caused by Fusarium nivale. Phytopathology 24: 197-214.

Davis, E.C., Franklin, J.B., Shaw, J., and Vilgalys, R. 2003. Endophytic Xylaria (Xylariaceae) among liverworts and angiosperms: phylogenetics, distribution, and symbiosis. American Journal of Botany 90(11): 1661-1667.

Dubas, E., Golebiowska, G., Zur, I., and Wedzony, M. 2011. Microdochium nivale (Fr., Samuels & Hallett): cytological analysis of the infection process in triticale (x Triticosecale Wittm.). Acta Physiologiae Plantarum 33: 529-539.

Heath, M.C., and Skalamera, D. 1997. Cellular interactions between plants and biotrophic fungal parasites. Advances in Botanical Research 24: 195-225.

Hofgaard, I.S., Wanner, L.A., Hageskal, G., Henriksen, B., Klemsdal, S.S., and Tronsmo, A.M. 2006. Isolates of Microdochium nivale and M. majus differentiated by pathogenicity on perennial ryegrass (Lolium perenne L.) and in vitro growth at low temperature. Journal of Phytopathology 154(5): 267-274.

Howard, R.J., and Valent, B. 1996. Breaking and entering: host penetration by the fungal rice blast pathogen Magnaporthe grisea. Annual Review of Microbiology 50: 491-512.

Ioos, R., Belhadj, A., and Menez, M. 2004. Occurrence and distribution of Microdochium nivale and Fusarium species isolated from barley, durum and soft wheat grains in France from 2000 to 2002. Mycopathologia 158(3): 351-362.

Khan, A., and Hsiang, T. 2006. The infection process of Colletotrichum graminicola and relative aggressiveness on four turfgrass species. Canadian Journal of Microbiology 49: 433-442.

Mahuku, G.S., Hsiang, T., and Yang, L. 1998. Genetic diversity of Microdochium nivale isolates from turfgrass. Mycological Research 102: 559-567.

Maurin, N., Rezanoor, H.N., Lamkadmi, Z., Some, A., and Nicholson, P. 1995. A comparison of biological, molecular, and enzymatic markers to investigate variability within Microdochium nivale (Fries) Samuels and Hallett. Agronomie 15(1): 39-47.

Mendgen, K., Hahn, M., and Deising, H. 1996. Morphogenesis and mechanisms of penetration by plant pathogenic fungi. Annual Review of Phytopathology 34: 367-386.

Mendgen, K., Struck, C., Voegele, R.T., and Hahn, M. 2000. Biotrophy and rust haustoria. Physiological and Molecular Plant Pathology 56: 141-145.

Pronczuk, M., and Messyasz, M. 1991. Infection ability of mycelium and spores of Microdochium nivale (Fr.) Samuels & Hallett to Lolium perenne L. Mycotoxin Research 7A.

Simpson, D.R., Rezanoor, H.N., Parry, D.W., and Nicholson, P. 2000. Evidence for differential host preference in Microdochium nivale var. majus and Microdochium nivale var. nivale. Plant Pathology 49(2): 261-268.

Smith, J.D. 1983. Fusarium nivale (Gerlachia nivalis) from cereals and grasses - is it the same fungus? Canadian Plant Disease Survey 63(1): 25-26.

Snider, C.S., Hsiang, T., Zhao, G.Y., and Griffith, M. 2000. Role of ice nucleation and antifreeze activities in pathogenesis and growth of snow molds. Phytopathology 90(4): 354-361.

Tronsmo, A.M., Hsiang, T., Okuyama, H., and Nakajima, T. 2001. Low temperature diseases caused by Microdochium nivale. In Low temperature plant microbe interactions under snow. Edited by D.A. Gaudet, Tronsmo, A.M., Matsumoto, N., Yoshida, M., and Nishimune, A. Hokkaido National Agricultural Experiment Station, Japan.

Webster, J., and Weber, R.W.S. 2007. Introduction to Fungi, 3rd edition. Cambridge University Press, New York.

Zur, I.A., Dubas, E., Pociecha, E., Dubert, F., Kolasinska, I., and Plazek, A. 2011. Cytological analysis of infection process and the first defence responses induced in winter rye (Secale cereale L.) seedlings inoculated with Microdochium nivale. Physiological and Molecular Plant Pathology 76: 189-196.

Table 6.1 Summary of conditions tested in infection process experiments performed for M. nivale and M. majus inoculated on P. pratensis and T. aestivum.

Experiment  Inoculum         Number of isolates   Incubation        Light
            types*           M. nivale  M. majus  temperature (°C)  condition
A           HP               2          2         23                full light†
B           HP, CS           3          5         23                full light
C           HS, CS           1          3         23                darkness
D           HS               2          4         4                 darkness
E           HS               3          3         23                darkness

* HP = hyphal plug; CS = conidia suspension; HS = hyphal suspension

† 50 µmol/m²/s for 24 h each day

Table 6.2 Isolates of M. nivale and M. majus used in infection process experiments on T. aestivum and P. pratensis.

                                                     Experiment and inoculum type*
Isolate  Species    Host origin   Geographic origin  A   B       C   D   E
99049    M. majus   T. aestivum   Atwood, ON         HP  HS, CS  HS  HS  HS
99061    M. majus   T. aestivum   Atwood, ON         HP  HS      -   -   -
10148    M. majus   T. aestivum   France             -   CS      HS  -   -
12045    M. majus   T. aestivum   Ottawa, ON         -   -       -   -   HS
12430    M. majus   T. aestivum   Ottawa, ON         -   -       -   HS  HS
10085    M. nivale  P. annua      Guelph, ON         HP  CS      HS  -   -
10086    M. nivale  P. annua      Guelph, ON         HP  -       -   -   -
11011    M. nivale  P. pratensis  Guelph, ON         -   HS      -   -   -
11016    M. nivale  P. pratensis  Guelph, ON         -   -       CS  -   -
11228    M. nivale  P. annua      Guelph, ON         -   HS      CS  -   -
99084    M. nivale  T. aestivum   Atwood, ON         -   HS      -   -   -
99052    M. nivale  T. aestivum   Atwood, ON         -   HS      -   -   -
12260    M. nivale  T. aestivum   Ottawa, ON         -   -       -   HS  HS
12262    M. nivale  T. aestivum   Ottawa, ON         -   -       -   HS  HS
12267    M. nivale  P. pratensis  Guelph, ON         -   -       -   HS  HS
12438    M. nivale  P. pratensis  Guelph, ON         -   -       -   HS  -

* HP = hyphal plug; HS = hyphal suspension; CS = conidia suspension; - = not included

Table 6.3 Sample collection timepoints for infection process studies of M. nivale and M. majus on T. aestivum and P. pratensis.

            Collection times*
Experiment  Hours post-inoculation   Days post-inoculation
A           6, 8, 10, 12, 24         2, 3, 4, 5, 6, 7
B           -                        1, 2, 3, 4, 5, 6, 7
C           12, 14, 16, 18, 20, 24   2, 3, 4, 5, 6, 7
D           -                        1, 2, 3, 4, 5, 6, 7
E           -                        1, 2, 3, 4, 5, 6, 7

* All collection times that occurred within the first 24 hours following inoculation are listed in the hours post-inoculation column; all other collection times are listed in the days post-inoculation column.

Table 6.4 Comparisons between mean number of penetration observations per unit area for each isolate of Microdochium nivale (Mn) and M. majus (Mm) on detached leaves of Kentucky bluegrass (K) and wheat (W) in experiment D. Data for each isolate at each time point were compared using a two-sided Wilcoxon rank sum approximation. Means followed by different letters were significantly different at p = 0.05.

Days post-   12260 (MnW)   12262 (MnW)   12267 (MnK)   12438 (MnK)   12430 (MmW)   99049 (MmW)   H2O
inoculation  K      W      K      W      K      W      K      W      K      W      K      W      K      W
1            0a     0a     0a     0a     0a     0a     0a     0a     0a     0a     0a     0a     0a     0a
2            0a     0a     0a     0a     0a     0a     0a     0a     0a     0a     0a     0a     0a     0a
3            0a     0a     0a     0a     0a     0a     41a    0a     0a     0a     0a     0a     0a     0a
4            0a     0a     0a     0a     7a     0a     22a    0a     2a     0a     0a     0a     0a     0a
5            0a     0a     24a    4a     0a     0a     689a   11a    0a     0a     0a     0a     0a     0a
6            357a   11a    11a    0a     1009a  55a    231a   49a    21a    0a     4a     4a     0a     0a
7            60a    13a    532a   56a    3973a  180a   3887a  155a   41a    0a     323a   61a    0a     0a

Table 6.5 Comparisons between mean number of penetration observations per unit area for all isolates, regardless of identity, from each host type on Kentucky bluegrass and wheat in experiment D. Data for each host type at each time point were compared using a two-sided Wilcoxon rank sum approximation. Means followed by different letters were significantly different at p = 0.05.

Days post-   Kentucky bluegrass           Wheat
inoculation  turf-derived  wheat-derived  turf-derived  wheat-derived
             isolates      isolates       isolates      isolates
1            0a            0a             0a            0a
2            0a            0a             0a            0a
3            21a           0b             0a            0a
4            14a           0a             0a            0a
5            345           6a             5a            1a
6            620a          216b           52a           4b
7            3930a         217b           168a          54b

Table 6.6 Comparisons between mean number of penetration observations per unit area for each isolate of Microdochium nivale (Mn) or M. majus (Mm) on detached leaves of Kentucky bluegrass (K) and wheat (W) in experiment E. Data for each isolate at each time point were compared using a two-sided Wilcoxon rank sum approximation. Means followed by different letters were significantly different at p = 0.05.

Days post-   12260 (MnW)   12267 (MnK)   12430 (MnK)   12045 (MmW)   99049 (MnW)   12262 (MnW)   H2O
inoculation  K      W      K      W      K      W      K      W      K      W      K      W      K      W
1            0a     0a     0a     0a     0a     0a     0a     0a     0a     0a     0a     0a     0a     0a
2            0a     0a     7a     3a     1a     0a     0a     0a     3a     0a     0a     0a     0a     0a
3            0a     0a     230a   49a    42a    4a     190a   0a     368a   49a    21a    0.4a   0a     0a
4            1a     0a     150a   45a    69a    9a     349a   0a     376a   5a     82a    81a    0a     0a
5            270a   0a     273a   63a    166a   63a    367a   25a    464a   56a    147a   51a    0a     0a
6            357a   3a     653a   156a   379a   60a    297a   146a   465a   113a   433a   179a   0a     0a
7            646a   101a   275a   58a    572a   242a   466a   95a    579a   222a   386a   223a   0a     0a

Table 6.7 Comparisons between mean number of incidences of penetration per unit area for all isolates from each host type on detached leaves of Kentucky bluegrass and wheat in experiment E. Data for each host type at each time point were compared using a two-sided Wilcoxon rank sum approximation. Means followed by different letters were significantly different at p = 0.05.

Days post-   Kentucky bluegrass host      Wheat host
inoculation  turf-derived  wheat-derived  turf-derived  wheat-derived
             isolates      isolates       isolates      isolates
1            0a            0a             0a            0a
2            1a            1a             0a            0a
3            25a           2a             155a          116a
4            33a           6a             209a          153a
5            67a           44a            267a          265a
6            113a          103a           477a          338a
7            158a          156a           522a          470a

[Figure 6.1 diagram. Stage labels: infection, host recognition, invasion, penetration, pathogen growth, attachment, symptom development, incubation, pathogen dormancy, inoculum production and dissemination.]

Figure 6.1 The disease cycle describing the events that occur during a host-pathogen interaction (modified from Agrios 2005).

[Figure 6.2 plot. Panel A: P. pratensis; panel B: Triticum aestivum. x-axis: days post-inoculation; y-axis: number of leaves penetrated. Series: 99049 (M. majus), 99061 (M. majus), 10085 (M. nivale), 10086 (M. nivale).]

Figure 6.2 Number of detached leaf segments of P. pratensis (A) and T. aestivum (B) where penetration was observed at time of collection for leaf blades treated with hyphal inoculum in experiment A. Three leaf blades were collected at each time point.

Figure 6.3 Penetration of stomata of detached leaves of T. aestivum by hyphae of M. majus isolate 99049 (stained blue) incubated on moistened filter paper at 22 °C. Photo taken at 3 dpi, 400x magnification.

Figure 6.4 Hyphae of M. majus isolate 99061 emerging from stomata of detached leaves of wheat (circled) incubated on moistened filter paper at 22 °C. Photo taken at 5 dpi, 400x magnification.

[Figure 6.5 plot. Panel A: P. pratensis; panel B: Triticum aestivum. x-axis: days post-inoculation; y-axis: number of leaves penetrated. Series: 99084 (M. nivale wheat isolate), 99052 (M. nivale wheat isolate), 99061 (M. majus wheat isolate), 99049 (M. majus wheat isolate), 11011 (M. nivale turf isolate), 11228 (M. nivale turf isolate).]

Figure 6.5 Number of detached leaf segments of P. pratensis (A) and T. aestivum (B) incubated on moist filter paper where penetration was observed at time of collection for leaf blades treated with hyphal inoculum in experiment B. Three leaf blades were collected at each time point.

[Figure 6.6 plot. Panel A: P. pratensis; panel B: Triticum aestivum. x-axis: days post-inoculation; y-axis: number of leaves penetrated. Series: 99049 (M. majus), 10148 (M. majus), 10085 (M. nivale).]

Figure 6.6 Number of detached leaf segments of P. pratensis (A) and T. aestivum (B) incubated on moist filter paper where penetration was observed at time of collection for leaf blades treated with conidial inoculum in experiment B. Three leaf blades were collected at each time point.

[Figure 6.7 plot. Panel A: P. pratensis; panel B: Triticum aestivum. x-axis: days post-inoculation; y-axis: number of leaves penetrated. Series: 99049 (M. majus), 10148 (M. majus), 10085 (M. nivale).]

Figure 6.7 Number of detached leaf segments of P. pratensis (A) and T. aestivum (B) incubated on moist filter paper where penetration was observed at time of collection for leaf blades treated with hyphal inoculum in experiment C.

[Figure 6.8 plot. Panel A: P. pratensis; panel B: Triticum aestivum. x-axis: days post-inoculation; y-axis: number of leaves penetrated. Series: 11016 (M. nivale), 11228 (M. nivale).]

Figure 6.8 Number of detached leaf segments of P. pratensis (A) and T. aestivum (B) incubated on moist filter paper where penetration was observed at time of collection for leaf blades treated with conidial inoculum in experiment C.

[Figure 6.9 plot. Panel A: P. pratensis; panel B: Triticum aestivum. x-axis: days post-inoculation; y-axis (as labelled in the extracted figure): number of leaves penetrated. Series: 12430 (M. majus wheat), 99049 (M. majus wheat), 12267 (M. nivale turf), 12438 (M. nivale turf), 12260 (M. nivale wheat), 12262 (M. nivale wheat).]

Figure 6.9 Number of incidences of penetration per unit area on P. pratensis (A) and T. aestivum (B) incubated on moist filter paper at time of collection for leaf blades treated with hyphal inoculum in experiment D.

[Figure 6.10 plot. Panel A: P. pratensis; panel B: Triticum aestivum. x-axis: days post-inoculation; y-axis (as labelled in the extracted figure): number of leaves penetrated. Series: 12045 (M. majus wheat), 99049 (M. majus wheat), 12267 (M. nivale turf), 12438 (M. nivale turf), 12260 (M. nivale wheat), 12262 (M. nivale wheat).]

Figure 6.10 Number of incidences of penetration per unit area on P. pratensis (A) and T. aestivum (B) incubated on moist filter paper at time of collection for leaf blades treated with hyphal inoculum in experiment E.

Appendices for Chapter 6

Appendix 6.1: SAS Statements

* FILENAME: 130521_expE.sas;
* DATE: 13.05.21;
* Data from infection study experiment;
* Precede comments with an asterisk, and end with semicolon;

data temp;
  options pagesize=200 linesize=120;
  infile cards; * dlm='09'x;
  * dlm is delimiter and '09'x is the ascii symbol for tabs;
  input day $ isolate $ species $ origin $ host $ leaf_no pen_by_area;
  cards;
1 12045 M W K 1 0
1 12045 M W K 2 0
...
;
run;

* Comparisons between isolate origins, within each day and host leaf type;
proc sort; by day host;
* Normality tests for penetration counts;
proc univariate normal; by day host; var pen_by_area; run;
* Nonparametric comparison of origins (includes the Wilcoxon rank sum test);
proc npar1way; by day host; class origin; run;
* One-way model with LSD mean separation and homogeneity-of-variance test;
proc glm; by day host; class species isolate origin;
  model pen_by_area = origin;
  means origin / LSD;
  means origin / hovtest;
run;

* Comparisons between host leaf types, within each day and isolate;
proc sort; by day isolate;
proc glm; by day isolate; class species origin host;
  model pen_by_area = host;
  means host / LSD;
  means host / hovtest;
run;
proc univariate normal; by day isolate; var pen_by_area; run;
proc npar1way; by day isolate; class host; run;
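The proc npar1way steps above are presumably the source of the two-sided Wilcoxon rank sum approximation cited in the captions of Tables 6.4 to 6.7. As an illustration only, the Python sketch below shows one common form of that calculation: a rank sum with midranks for ties, a tie-corrected variance, and a continuity-corrected normal approximation. It is not the author's code; the function names and the per-leaf counts are hypothetical, and whether SAS applied exactly these settings in the original analysis is an assumption.

```python
import math


def midranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of sorted positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test using a normal approximation.

    Returns (w, z, p), where w is the rank sum of sample x.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must contain at least one value")
    n = n1 + n2
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[:n1])
    mean_w = n1 * (n + 1) / 2
    # Variance of the rank sum under the null hypothesis, corrected for ties.
    ss = sum((r - (n + 1) / 2) ** 2 for r in ranks)
    var_w = n1 * n2 * ss / (n * (n - 1))
    if var_w == 0:
        return w, 0.0, 1.0  # every observation tied: no evidence of a difference
    diff = w - mean_w
    # Continuity correction: shrink the deviation by 0.5 towards zero.
    if diff > 0:
        diff -= 0.5
    elif diff < 0:
        diff += 0.5
    z = diff / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return w, z, p


# Hypothetical penetration counts per unit area for one isolate on one day.
kentucky_bluegrass = [12, 30, 45, 51]
wheat = [0, 3, 8, 10]
w, z, p = rank_sum_test(kentucky_bluegrass, wheat)
print(f"W = {w}, z = {z:.3f}, two-sided p = {p:.4f}")
```

With only a few leaves per group, as in these experiments, the normal approximation is coarse, which may partly explain why large differences in means in Tables 6.4 and 6.6 share the same letter.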

Chapter 7 General Discussion and Conclusions

7.1 Major conclusions

The major conclusions from the research presented in this thesis are as follows:

1. Microdochium nivale and M. majus exhibit consistent genetic and genomic differences that validate their status as distinct species (Chapters 2 and 3). The genera Microdochium and Monographella (in which these species have been placed) have not yet been reviewed under the one fungus-one name system (Taylor 2011). Because the names based on the asexual stage are older (have priority), are better known, and are of less uncertain application than the names based on the sexual stage, they have been used in this thesis.

2. Intraspecific genetic differences exist within M. nivale, both within isolates collected from a single type of host plant and between isolates collected from different host species (Chapters 3 and 4). These findings are in agreement with previous reports that M. nivale has a high level of genetic variation.

3. The formation of ascospores (and hence the implication of sexual reproduction) was induced in the lab for M. majus, but could not be confirmed in M. nivale. The mating type genes in both species and in other members of the same order failed to conform consistently to the syntenic arrangement of this genetic region that was predicted based on other fungi. In addition, only the MAT1-2-1 mating gene was detected among all of the Microdochium and Xylariales strains examined. As a result, although M. majus displayed apparently homothallic mating, it was not possible to label M. nivale as homothallic, heterothallic, or pseudohomothallic (Chapter 5).

4. Despite host preferences reported from field studies, infection process studies on detached leaves did not reveal pathogenic differences between M. nivale and M. majus or between isolates collected from different plant hosts. When infection occurred, neither appressoria nor haustoria were observed; however, this observation may have been influenced by the use of detached leaves, rather than whole plants (Chapter 6).

7.2 General Discussion and Conclusions

The major goal of this thesis was to investigate differences between Microdochium nivale and M. majus, two fungal plant pathogens that were recently recognized as distinct species (Glynn et al. 2005) rather than conspecific varieties. This nomenclatural change was preceded by claims that the recognition of M. nivale and M. majus as separate varieties was not meaningful based on morphological observations (Litschko and Burpee 1987). As a result, the projects described herein were undertaken to explore this apparent contradiction. At the beginning of the research project (Chapter 2), genetic differences between M. nivale and M. majus were investigated by examining the nucleotide sequences of four genomic regions for a small sample of isolates from various host plant and geographic origins. Among the four regions examined, only ITS failed to resolve the species into distinct clades; the remaining genetic regions examined confirmed that M. nivale and M. majus are genetically distinct. Further divisions were resolved within M. majus, where the isolates from Europe and North America were grouped separately by the analysis of RPB2, and within M. nivale, for which the sequences of β-tubulin and RPB2 suggested that isolates obtained from wheat are genetically distinct from turfgrass isolates.

The findings of Chapter 2 were further investigated by using next-generation sequencing technology to produce draft genome sequences for several isolates of M. nivale and M. majus and one isolate of M. bolleyi, a root-inhabiting species (Chapter 3). When these whole-genome sequences were compared within and between species, several interesting trends were observed. Despite originating from North America and Europe, respectively, the genomes of two M. majus isolates were found to be nearly identical in terms of the number and the identities of their predicted gene sequences. A similar trend was observed for the two M. nivale wheat-derived isolates sequenced, which were also collected from two different continents. In contrast, the turf isolate of M. nivale exhibited genome-wide differences relative to the wheat isolates. The M. nivale turf isolate shared 78 to 79% of its predicted genes with the other M. nivale and M. majus isolates at an e-value of 1e-50, whereas the two M. nivale wheat isolates shared about 92% of their predicted genes and the two M. majus isolates shared 95% of their predicted genes. When a tree was constructed using 10 randomly selected sequences from Microdochium, the turf isolate was found to be as dissimilar from the other M. nivale and M. majus sequences as it was from M. bolleyi. The biotypes of M. nivale and M. majus on different host plants appear to be genetically distinct, and gene exchange may be limited between isolates from different host species. This finding is broadly congruent with reports that the population of M. nivale on turfgrass may be distinct from that of cereals (Smith 1983).
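To make the "shared predicted genes at an e-value of 1e-50" figures concrete, the following Python sketch shows one way such a percentage could be computed once each predicted gene in one genome has a best-hit e-value against another genome. This is a hedged illustration, not the procedure used in Chapter 3: the function name, the gene identifiers, and the e-values are all hypothetical, and the assumption that a gene counts as shared when its best hit is at or below the threshold is ours.

```python
def percent_shared(best_hit_evalues, threshold=1e-50):
    """Percentage of query genes whose best hit in the other genome has an
    e-value at or below `threshold`. Genes without any hit map to None."""
    if not best_hit_evalues:
        raise ValueError("no genes supplied")
    shared = sum(
        1 for e in best_hit_evalues.values()
        if e is not None and e <= threshold
    )
    return 100.0 * shared / len(best_hit_evalues)


# Hypothetical best-hit e-values for five predicted genes (not thesis data).
example = {
    "gene_0001": 0.0,
    "gene_0002": 3e-120,
    "gene_0003": 1e-50,
    "gene_0004": 2e-12,  # hit too weak to count as shared
    "gene_0005": None,   # no hit at all
}
print(f"{percent_shared(example):.0f}% of predicted genes shared")
```

A stricter threshold lowers the reported percentage, so comparisons between isolate pairs are only meaningful when the same cut-off is applied to each pair, as was done here with 1e-50.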

Microdochium nivale and M. majus exhibit different sensitivities to plant defensive enzymes (Simpson et al. 2000), which is likely related to their host preferences and may prevent their interaction in the field. The presence of different sensitivities to host defensive enzymes or compounds within either species is unknown, but is a potential area for future investigation. In addition, the observation that the genomes of M. nivale isolates displayed a lower level of similarity with one another than did the genomes of M. majus isolates is also in agreement with studies that examined RAPD patterns (Lees et al. 1995) as well as RFLP patterns of ITS and esterase protein profiles (Maurin et al. 1995), which revealed that M. nivale isolates displayed a higher level of variability than M. majus.

However, the suggestion that individual isolates may exhibit strong host specialization contrasts with the observations in Chapter 6 that all isolates of M. nivale and M. majus examined were capable of causing infection on both Triticum aestivum and Poa pratensis regardless of their host origin. This observation is in agreement with observations of M. nivale and M. majus on wheat seedlings, wherein host origin and fungal species were not strongly correlated with aggressiveness (Maurin et al. 1995). The observations of the physical infection process in this thesis may have been affected by the conditions used in these experiments: spore suspensions of single fungal isolates were inoculated on detached plant leaves under sterile conditions, and the lack of competition by other microbes may have permitted fungi to invade successfully where they would have failed in the field. Similarly, when the infection process experiments were performed at 4 °C rather than 23 °C, the timing of the infection was delayed, but the physical process of infection was otherwise unchanged. This suggests that the ability of these pathogens to cause disease at low temperatures in the field does not preclude them from infecting plant tissue at warmer temperatures; instead, a more likely explanation for their scarcity as disease-causing agents at higher temperatures is the presence of microbial competitors or the increased activity of plant defences. In co-inoculation studies, host preferences have been observed between M. nivale and M. majus on oat (Avena sativa), barley (Hordeum vulgare), and wheat (T. aestivum) (Simpson et al. 2000). However, when M. nivale, M. majus, and Fusarium culmorum (a wheat pathogen) were inoculated on wheat or rye, the colonization success of each species was dependent on both the presence or absence of other pathogen species and the temperature of inoculation (Simpson et al. 2004).

Throughout the infection process experiments, neither appressoria nor haustoria were observed. This contrasts with reports that haustoria were produced during the infection of the hybrid cereal triticale (x Triticosecale) and of rye (Secale cereale) by M. nivale (Dubas et al. 2011; Zur et al. 2011). Aside from possible host species differences, this may also be related to the use of detached leaves in the current experiments rather than whole plants (as done in the cited studies). This finding is important when contrasted with an earlier report that, at least in terms of isolate aggressiveness, detached leaf assays provided a meaningful proxy for experiments with whole plants (Diamond and Cooke 1999). To investigate the importance of competition in the host- and temperature-specificities observed in the field, experiments using whole plants may help to clarify this apparent contradiction. It may also be valuable to prepare fluorescently labeled strains of M. nivale and M. majus, which would facilitate the monitoring of the infection process of these pathogens in the presence of microbial competitors. Although attempts were made to create such strains, they were ultimately unsuccessful and were not reported in this thesis.

Another tool to investigate genetic diversity in relation to host-specific populations is the examination of microsatellite regions by ISSR and/or SSR. Within this thesis, this resource was applied to isolates of M. nivale collected from two proximate locations across three years (Chapter 4). Unique haplotypes were observed for almost every isolate examined. The high level of diversity observed is similar to that reported in an earlier study of M. nivale from three turfgrass hosts, wherein 96 out of 100 isolates tested displayed a unique genotype when amplified by RAPD (Mahuku et al. 1998). The ISSR and SSR primers identified in Chapter 4 could be used to assess genetic diversity among isolates of M. nivale and M. majus collected from a variety of grass and cereal hosts. This investigation could help to clarify whether fungal isolates may be capable of attacking several different host plants in the field. Outside of identifying differences between fungal populations on different host plant species, the application of ISSR and SSR to isolates collected from a single economically important host, such as T. aestivum, may be valuable.
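As a rough illustration of how "unique haplotypes" can be tallied from marker data, the Python sketch below counts distinct band profiles among isolates scored for the presence (1) or absence (0) of each ISSR or SSR band. It is a minimal sketch under stated assumptions: the binary scoring scheme, the function name, and the isolate labels are hypothetical and are not taken from Chapter 4.

```python
from collections import Counter


def haplotype_summary(profiles):
    """Return the number of distinct band profiles and the sorted names of
    isolates whose profile is shared with no other isolate."""
    counts = Counter(tuple(p) for p in profiles.values())
    unique_isolates = sorted(
        name for name, p in profiles.items() if counts[tuple(p)] == 1
    )
    return len(counts), unique_isolates


# Hypothetical presence (1) / absence (0) scores for five bands.
profiles = {
    "isolate_A": [1, 0, 1, 1, 0],
    "isolate_B": [1, 0, 1, 1, 0],  # same profile as isolate_A
    "isolate_C": [0, 1, 1, 0, 0],
    "isolate_D": [1, 1, 0, 0, 1],
}
n_haplotypes, unique = haplotype_summary(profiles)
print(n_haplotypes, "haplotypes;", len(unique), "of", len(profiles), "isolates unique")
```

The same tally applied separately to isolates grouped by host, year, or location would show whether any haplotype recurs across groups, which is the kind of evidence the proposed multi-host survey would need.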

A priority for future research pertaining to the genomic differences observed is the sequencing of at least one additional turf isolate of M. nivale. This would help to clarify whether some of the differences observed between the wheat and turf isolates are truly due to host-specific variation. Alternatively, turf-derived isolates as a group may be more diverse than wheat-derived isolates; although this direct comparison was not assessed within the thesis, the ISSR and SSR work confirmed that turfgrass isolates of M. nivale are indeed genetically diverse. The investigation of M. nivale and M. majus isolates from other cereals, such as oats and barley, would not only clarify the relationship of these pathogen populations to those examined, but may also help to demonstrate whether differing hosts really do represent a barrier to gene flow among these species.

The yearly emergence of distinct genotypes observed among M. nivale is generally consistent with earlier hypotheses (e.g. Mahuku et al. 1998) that sexual reproduction is infrequent but present in this species. To explore the genetic basis for mating in M. nivale and M. majus, an attempt was made to identify the mating-type genes in those species using PCR primers designed for other filamentous ascomycetes (Chapter 5). These initial experiments failed even after considerable effort, and the MAT1 genes were only identified after the genome sequencing data became available. This difficulty is explained by the generally poor nucleotide-level conservation of the MAT1 genes. Despite previous reports that M. majus is homothallic (e.g. Maurin et al. 1995) and that M. nivale may be either homothallic or heterothallic (Litschko and Burpee 1987; Maurin et al. 1995), only MAT1-2-1 was detected in the genomes of all six Microdochium isolates sequenced. When additional isolates of both species were examined by PCR, the candidate MAT1-2-1 sequence was amplified in every isolate tested, regardless of host plant or geographic origin. Furthermore, the sequences were located between the APN2 and SLA2 genes, which have been reported to flank the MAT1 genes among the Sordariomycetes (Butler 2007). This synteny supports the identification of the sequence as MAT1-2-1. A candidate sequence for MAT1-1-1 was not detected in any of the Microdochium isolates, nor in the genomes of any of the other Xylariales examined. Despite possessing only a single mating-type gene, unpaired isolates of M. majus produced asci and ascospores when incubated on wheat straw, demonstrating apparently homothallic reproduction without one of the two genes reported to be essential for this process (Butler 2007). Perithecia were observed in some M. nivale pairings, but no ascospores were observed; although this result is in general agreement with reports that M. nivale produces perithecia less frequently than M. majus (Lees et al. 1995), the difference was not readily explained by the apparent presence of only the MAT1-2-1 gene among more than 90 M. nivale isolates tested.
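The synteny argument above can be illustrated with a minimal sketch that tests whether a candidate gene lies between two flanking genes on the same contig. The `Gene` record, the `is_flanked` function, and all contig names and coordinates below are hypothetical and are not taken from the Microdochium assemblies; the sketch only shows the logic of the check, not the method actually used in Chapter 5.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    """A gene annotation on an assembled contig (hypothetical record type)."""
    name: str
    contig: str
    start: int
    end: int


def is_flanked(candidate: Gene, left: Gene, right: Gene) -> bool:
    """Return True if the candidate lies between the two flanking genes on one contig."""
    # All three genes must be on the same contig for synteny to be assessed.
    if not (candidate.contig == left.contig == right.contig):
        return False
    # Order the flanking genes by position so either orientation is accepted.
    first, second = sorted((left, right), key=lambda g: g.start)
    return first.end < candidate.start and candidate.end < second.start


# Hypothetical coordinates, for illustration only.
apn2 = Gene("APN2", "contig_12", 10_500, 12_400)
mat = Gene("MAT1-2-1", "contig_12", 13_100, 14_000)
sla2 = Gene("SLA2", "contig_12", 15_200, 18_900)
elsewhere = Gene("MAT1-2-1", "contig_40", 13_100, 14_000)

print(is_flanked(mat, apn2, sla2))
print(is_flanked(elsewhere, apn2, sla2))
```

A candidate that fails this check is not necessarily a false positive, because a fragmented draft assembly may place flanking genes on separate contigs.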

These observations, together with the lack of a candidate MAT1-1-1 homolog in any of the other members of the order Xylariales examined, led to the hypothesis that an as-yet-undetected gene may be responsible for the control of mating in this order. Species that are not easily classified as homothallic, heterothallic, or pseudohomothallic based on their observed mating behaviour or genetic information have also been identified among other genera of Sordariomycetes (Chen et al. 2002; Lin and Heitman 2007; Menat et al. 2012; Rodriguez-Guerra et al. 2005; Vaillancourt et al. 2000), corroborating the need for a more thorough investigation and analysis of the control of mating among the Ascomycota.

There are several ways to gather further information about the control of mating among Microdochium and other members of the order Xylariales. For M. nivale, it would be valuable to perform additional mating crosses in the lab, especially among isolates from a wider variety of sources, in order to assess more thoroughly the conditions under which M. nivale produces ascospores (Lees et al. 1995; Parry et al. 1995). This type of investigation would also prove useful for other species within the order Xylariales, for most of which there is very limited information on whether mating is homothallic, heterothallic, or unclassified. In cases where mating has been observed and can be reliably induced in the lab, a technique like RNA-Seq could be used to reveal which genes are expressed immediately prior to and during the formation of sexual structures and spores. Such an investigation could identify candidate genes responsible for the initiation of sexual reproduction in these species, and may reveal a MAT1-1-1-like sequence that was overlooked in the present work.

The numerous attempts to identify mating-type genes among Microdochium and other members of the order Xylariales demonstrate the power of having a genome sequence available. Various PCR-based strategies were applied over approximately one year to detect MAT1 sequences in Microdochium, but all were ultimately unsuccessful despite consuming considerable effort and resources. In contrast, even the very rough draft genome sequence first obtained for M. majus 99049 provided a MAT1-2-1 sequence within minutes of assembly. Moving forward, when the aim is to find genes that are not highly conserved, and especially when genome sequences of closely related species are unavailable, sequencing the genome of the organism of interest should be considered before undertaking a PCR-based project.

Outside of mating, transposable elements (TEs) may also present an important source of genetic diversity among fungi, especially pathogens (e.g. Amyotte et al. 2012; Dean et al. 2005; Hatta et al. 2002; Thon et al. 2006). Transposable elements were detected among the Microdochium genomes examined (Chapter 3). Transposons in fungi are subject to degradation by a mechanism known as repeat-induced point mutation (RIP) (Selker et al. 1987), which allows the rapid accumulation of mutations within highly repetitive DNA sequences, such as transposons, presumably to prevent them from moving throughout the genome and disrupting functional genes. Interestingly, because TEs are often physically associated with pathogenicity-related genes (such as effectors), this mechanism may encourage the rapid diversification of pathogenicity-related sequences, providing a partial explanation for the high mutation rate of effectors (de Jonge et al. 2011); indeed, the frequency of RIP in non-repetitive regions is correlated with their proximity to repetitive regions (Van de Wouw et al. 2010). In the sequenced Microdochium genomes, more than 50% of the putative TEs identified were within 5 kb of a putative pathogenicity-associated gene. A high level of physical proximity between pathogenicity-related genes and highly variable genetic regions, including transposable elements, has been identified among other fungal phytopathogens, including Magnaporthe grisea (Dean et al. 2005), Leptosphaeria maculans (Van de Wouw et al. 2010), and Fusarium oxysporum f. sp. lycopersici (Ma et al. 2010).
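A proximity figure of the kind reported above (the share of putative TEs within 5 kb of a putative pathogenicity-associated gene) can be computed from annotation coordinates. The following is a minimal sketch that assumes annotations are available as (contig, start, end) tuples; the function name `fraction_near`, the distance convention (nearest edge to nearest edge, zero for overlaps), and the coordinates are illustrative assumptions and do not reproduce the analysis in Chapter 3.

```python
def fraction_near(tes, genes, window=5_000):
    """Fraction of TEs lying within `window` bp of any gene on the same contig.

    Both inputs are lists of (contig, start, end) tuples. Distance is measured
    between nearest edges and is zero when the two intervals overlap.
    """
    def gap(a_start, a_end, b_start, b_end):
        # Positive only when the intervals are disjoint.
        return max(0, b_start - a_end, a_start - b_end)

    if not tes:
        return 0.0
    near = 0
    for contig, start, end in tes:
        if any(c == contig and gap(start, end, g_start, g_end) <= window
               for c, g_start, g_end in genes):
            near += 1
    return near / len(tes)


# Hypothetical annotations, for illustration only.
tes = [("c1", 1_000, 2_000), ("c1", 20_000, 21_000), ("c2", 500, 900)]
genes = [("c1", 4_000, 6_000), ("c2", 10_000, 12_000)]

print(fraction_near(tes, genes))
```

With these illustrative coordinates only the first TE lies within 5 kb of a gene, so one TE in three is counted as proximal. For genome-scale annotations an interval tree or a sorted sweep would be more efficient than this pairwise comparison.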

To assess whether the putative pathogen-host interaction (PHI) genes identified in the genomes really do play a role in pathogenicity, RNA-Seq could be used to investigate transcriptional differences in these fungi while they are interacting with a host plant, in comparison to when they are growing on artificial media. For the fungal transcripts detected at significantly higher or lower levels in the plant-fungal interaction data set, RT-PCR could then be used to monitor changes in expression level at several points throughout the infection process. Towards this goal, RNA-Seq data have been collected for M. nivale and M. majus in interaction with T. aestivum and Agrostis stolonifera (creeping bentgrass). The analysis of those data is ongoing.
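The comparison proposed above, between transcript levels in planta and on artificial media, can be sketched as a log2 fold change of library-size-normalized read counts. The sketch below is a simplified illustration only: the gene names and counts are invented, the functions `cpm` and `log2_fold_change` are hypothetical helpers, and a real analysis would use replicated samples and a dedicated statistical package to test for significance.

```python
import math


def cpm(counts):
    """Convert raw read counts per gene to counts per million mapped reads."""
    total = sum(counts.values())
    return {gene: n * 1e6 / total for gene, n in counts.items()}


def log2_fold_change(in_planta, in_vitro, pseudocount=1.0):
    """log2(in planta / in vitro) per gene, after counts-per-million normalization.

    The pseudocount avoids division by zero for genes with no reads.
    """
    a, b = cpm(in_planta), cpm(in_vitro)
    return {gene: math.log2((a[gene] + pseudocount) / (b[gene] + pseudocount))
            for gene in a.keys() & b.keys()}


# Invented read counts, for illustration only.
in_vitro = {"geneA": 100, "geneB": 400, "geneC": 500}
in_planta = {"geneA": 400, "geneB": 100, "geneC": 500}

lfc = log2_fold_change(in_planta, in_vitro)
for gene in sorted(lfc):
    print(gene, round(lfc[gene], 2))
```

Positive values indicate transcripts that are more abundant during the plant interaction, and negative values indicate transcripts that are more abundant on artificial media.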

Fungal effectors are proteins that interact with and can shut down a plant's immune system (Maffei et al. 2012). When fungal effectors are produced, they may be secreted into the apoplast or delivered inside plant cells, either via haustoria or by co-opting existing plant transporters (Hogenhout et al. 2009). Based on the apparent lack of haustoria in the Microdochium isolates examined, the effectors of M. nivale and M. majus may be secreted into the apoplast. This could be investigated using in situ hybridization to visualize the localization of these effectors, in combination with microscopic investigations of the type already performed.

In this thesis, a variety of techniques were applied to explore differences between the closely related fungal plant pathogens Microdochium nivale and M. majus. Traditional phytopathological techniques were combined with tools from modern molecular biology to investigate genetic, genomic, and pathological differences between these two fungal species. Unexpected trends were detected both between and within these species, suggesting that future work may reveal further differences between populations that are isolated either geographically or on different host species. In addition, the mating-type genes found within these species failed to correlate as predicted with the homothallic mode of reproduction that was observed for M. majus and that has been reported for M. nivale, suggesting that the control of mating in these species, and perhaps in other members of the Xylariales, may differ from that in other Sordariomycetes and involve yet-to-be-identified genes. Together, the results in this thesis demonstrate the complexity that remains to be uncovered even within economically important pathogenic species that have been known to science for over 100 years.

7.3 References for Chapter 7

Amyotte, S.G., Tan, X., Pennerman, K., Jimenez-Gasco, M.d.M., Klosterman, S.J., Ma, L.-J., Dobinson, K.F., and Veronese, P. 2012. Transposable elements in phytopathogenic Verticillium spp.: insights into genome evolution and inter- and intra-specific diversification. BMC Genomics 13: 314-333.

Butler, G. 2007. The Evolution of MAT: The Ascomycetes. In Sex in Fungi: Molecular Determination and Evolutionary Implications. Edited by J. Heitman, J.W. Kronstad, J.W. Taylor, and L.A. Casselton. ASM Press, Washington, DC.

Chen, F., Goodwin, P.H., Khan, A., and Hsiang, T. 2002. Population structure and mating-type genes of Colletotrichum graminicola from Agrostis palustris. Canadian Journal of Microbiology 48: 427-436.

de Jonge, R., Bolton, M.D., and Thomma, B.P. 2011. How filamentous pathogens co-opt plants: the ins and outs of fungal effectors. Current Opinion in Plant Biology 14: 400-406.

Dean, R.A., Talbot, N.J., Ebbole, D.J., Farman, M.L., Mitchell, T.K., Orbach, M.J., Thon, M., Kulkarni, R., Xu, J.-R., Pan, H., Read, N.D., Lee, Y.-H., Carbone, I., Brown, D., Oh, Y.Y., Donofrio, N., Jeong, J.S., Soanes, D.M., Djonovic, S., Kolomiets, E., Rehmeyer, C., Li, W., Harding, M., Kim, S., Lebrun, M.-H., Bohnert, H., Coughlan, S., Butler, J., Calvo, S., Ma, L.-J., Nicol, R., Purcell, S., Nusbaum, C., Galagan, J.E., and Birren, B.W. 2005. The genome sequence of the rice blast fungus Magnaporthe grisea. Nature 434: 980-986.

Diamond, H., and Cooke, B.M. 1999. Towards the development of a novel in vitro strategy for early screening of Fusarium ear blight resistance in adult winter wheat plants. European Journal of Plant Pathology 105(4): 363-372.

Dubas, E., Golebiowska, G., Zur, I., and Wedzony, M. 2011. Microdochium nivale (Fr., Samuels & Hallett): cytological analysis of the infection process in triticale (x Triticosecale Wittm.). Acta Physiologiae Plantarum 33: 529-539.

Glynn, N.C., Hare, M.C., Parry, D.W., and Edwards, S.G. 2005. Phylogenetic analysis of EF-1 alpha gene sequences from isolates of Microdochium nivale leads to elevation of varieties majus and nivale to species status. Mycological Research 109: 872-880.

Hatta, R., Ito, K., Hosaki, Y., Tanaka, T., Tanaka, A., Yamamoto, M., Akimitsu, K., and Tsuge, T. 2002. A conditionally dispensable chromosome controls host-specific pathogenicity in the fungal plant pathogen Alternaria alternata. Genetics 161: 59-70.

Hogenhout, S.A., Van der Hoorn, R.A.L., Terauchi, R., and Kamoun, S. 2009. Emerging concepts in effector biology of plant-associated organisms. Molecular Plant-Microbe Interactions 22(2): 115-122.

Lees, A.K., Nicholson, P., Rezanoor, H.N., and Parry, D.W. 1995. Analysis of variation within Microdochium nivale from wheat - evidence for a distinct subgroup. Mycological Research 99: 103-109.

Lin, X., and Heitman, J. 2007. Mechanisms of homothallism in fungi and transitions between heterothallism and homothallism. In Sex in Fungi: Molecular Determination and Evolutionary Implications. Edited by J. Heitman, J.W. Kronstad, J.W. Taylor, and L.A. Casselton. ASM Press, Washington, DC.

Litschko, L., and Burpee, L.L. 1987. Variation among isolates of Microdochium nivale collected from wheat and turfgrasses. Transactions of the British Mycological Society 89: 252-256.

Ma, L.-J., van der Does, H.C., Borkovich, K.A., Coleman, J.J., Daboussi, M.-J., Di Pietro, A., Dufresne, M., Freitag, M., Grabherr, M.G., Henrissat, B., Houterman, P.M., Kang, S., Shim, W.-B., Woloshuk, C., Xie, X., Xu, J.-R., Antoniw, J., Baker, S.E., Bluhm, B.H., Breakspear, A., Brown, D.W., Butchko, R.A.E., Chapman, S., Coulson, R., Coutinho, P.M., Danchin, E.G.J., Diener, A., Gale, L.R., Gardiner, D.M., Goff, S., Hammond-Kosack, K.E., Hilburn, K., Hua-Van, A., Jonkers, W., Kazan, K., Kodira, C.D., Koehrsen, M., Kumar, L., Lee, Y.-H., Li, L., Manners, J.M., Miranda-Saavedra, D., Mukherjee, M., Park, G., Park, J., Park, S.-Y., Proctor, R.H., Regev, A., Ruiz-Roldan, M.C., Sain, D., Sakthikumar, S., Sykes, S., Schwartz, D.C., Turgeon, B.G., Wapinski, I., Yoder, O., Young, S., Zeng, Q., Zhou, S., Galagan, J.E., Cuomo, C.A., Kistler, H.C., and Rep, M. 2010. Comparative genomics reveals mobile pathogenicity chromosomes in Fusarium. Nature 464: 367-373.

Maffei, M.E., Arimura, G.-I., and Mithofer, A. 2012. Natural elicitors, effectors and modulators of plant responses. Natural Product Reports 29(11): 1288-1303.

Mahuku, G.S., Hsiang, T., and Yang, L. 1998. Genetic diversity of Microdochium nivale isolates from turfgrass. Mycological Research 102: 559-567.

Maurin, N., Rezanoor, H.N., Lamkadmi, Z., Some, A., and Nicholson, P. 1995. A comparison of biological, molecular, and enzymatic markers to investigate variability within Microdochium nivale (Fries) Samuels and Hallett. Agronomie 15(1): 39-47.

Menat, J., Cabral, A.L., Vijayan, P., Wei, Y., and Banniza, S. 2012. Glomerella truncata: another Glomerella species with an atypical mating system. Mycologia 104(3): 641-649.

Parry, D.W., Rezanoor, H.N., Pettitt, T.R., Hare, M.C., and Nicholson, P. 1995. Analysis of Microdochium nivale isolates from wheat in the UK during 1993. Annals of Applied Biology 126(3): 449-455.

Rodriguez-Guerra, R., Ramirez-Rueda, M.-T., Cabral-Enciso, M., Garcia-Serrano, M., Lira-Maldonado, Z., Guevara-Gonzalez, R.G., Gonzalez-Chavira, M., and Simpson, J. 2005. Heterothallic mating observed between Mexican isolates of Glomerella lindemuthiana. Mycologia 97(4): 793-803.

Selker, E.U., Cambareri, E., Jensen, B., and Haack, K. 1987. Rearrangement of duplicated DNA in specialized cells of Neurospora. Cell 51: 741-752.

Simpson, D.R., Rezanoor, H.N., Parry, D.W., and Nicholson, P. 2000. Evidence for differential host preference in Microdochium nivale var. majus and Microdochium nivale var. nivale. Plant Pathology 49(2): 261-268.

Simpson, D.R., Thomsett, M.A., and Nicholson, P. 2004. Competitive interactions between Microdochium nivale var. majus, M. nivale var. nivale and Fusarium culmorum in planta and in vitro. Environmental Microbiology 6(1): 79-87.

Smith, J.D. 1983. Fusarium nivale (Gerlachia nivalis) from cereals and grasses - is it the same fungus? Canadian Plant Disease Survey 63(1): 25-26.

Thon, M., Pan, H., Diener, S., Papalas, J., Taro, A., Mitchell, T., and Dean, R. 2006. The role of transposable element clusters in genome evolution and loss of synteny in the rice blast fungus Magnaporthe oryzae. Genome Biology 7: R16.

Vaillancourt, L., Du, M., Wang, J., Rollins, J., and Hanau, R. 2000. Genetic analysis of cross fertility between two self-sterile strains of Glomerella graminicola. Mycologia 92(3): 430-435.

Van de Wouw, A.P., Cozijnsen, A.J., Hane, J.K., Brunner, P.C., McDonald, B.A., Oliver, R.P., and Howlett, B.J. 2010. Evolution of linked avirulence effectors in Leptosphaeria maculans is affected by genomic environment and exposure to resistance genes in host plants. PLoS Pathogens 6(11): e1001180.

Zur, I.A., Dubas, E., Pociecha, E., Dubert, F., Kolasinska, I., and Plazek, A. 2011. Cytological analysis of infection process and the first defence responses induced in winter rye (Secale cereale L.) seedlings inoculated with Microdochium nivale. Physiological and Molecular Plant Pathology 76: 189-196.
