<<

Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 Incorporating molecular data in fungal : a guide for aspiring researchers

Hyde KD1,2, Udayanga D1,2, Manamgoda DS1,2, Tedersoo L3,4, Larsson E5, Abarenkov K3, Bertrand YJK5, Oxelman B5, Hartmann M6,7, Kauserud H8, Ryberg M9, Kristiansson E10, Nilsson RH 5,*

1Institute of Excellence in Fungal Research, 2School of Science, Mae Fah Luang University, Chiang Rai, 57100, Thailand 3Institute of Ecology and Earth Sciences, University of Tartu, Tartu, Estonia 4Natural History Museum, University of Tartu, Tartu, Estonia 5Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 405 30 Göteborg, Sweden 6Forest Soils and Biogeochemistry, Swiss Federal Research Institute WSL, Birmensdorf, Switzerland 7Molecular Ecology, Agroscope Reckenholz-Tänikon Research Station ART, Zurich, Switzerland 8 Department of Biosciences, University of Oslo, PO Box 1066, Blindern, N-0316, Oslo, Norway 9Department of Organismal , Uppsala University, Norbyvägen 18D, 75236 Uppsala, Sweden 10Department of Mathematical , Chalmers University of Technology, 412 96 Göteborg, Sweden

Hyde KD, Udayanga D, Manamgoda DS, Tedersoo L, Larsson E, Abarenkov K, Bertrand YJK, Oxelman B, Hartmann M, Kauserud H, Ryberg M, Kristiansson E, Nilsson RH. 2013 – Incorporating molecular data in fungal systematics: a guide for aspiring researchers. Current Research in Environmental & Applied Mycology 3(1), 1–32, Doi 10.5943/cream/3/1/1

The last twenty years have witnessed molecular data emerge as a primary research instrument in most branches of mycology. Fungal systematics, , and ecology have all seen tremendous progress and have undergone rapid, far-reaching changes as disciplines in the wake of continual improvement in DNA technology. A taxonomic study that draws from molecular data involves a long series of steps, ranging from sampling through the various laboratory procedures and data analysis to the publication process. All steps are important and influence the results and the way they are perceived by the scientific community. The present paper provides a reflective overview of all major steps in such a project with the purpose to assist research students about to begin their first study using DNA-based methods. We also take the opportunity to discuss the role of taxonomy in biology and the life sciences in general in the light of molecular data. While the best way to learn molecular methods is to work side by side with someone experienced, we hope that the present paper will serve to lower the learning threshold for the reader.

Key words – biodiversity – molecular marker – phylogeny – publication – – taxonomy

Article Information Received 12 March 2013 Accepted 26 March 2013 Published online 29 April 2013 * Corresponding author: R. Henrik Nilsson – email – [email protected]

1 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 Introduction requirement as an effective tool in terms of Morphology has been the basis of time and resource consumption (Horn et al. nearly all taxonomic studies of fungi. Most 1996; Frisvad et al. 2008). species were previously introduced because Molecular data was first used in their morphology differed from that of other taxonomic studies of fungi in the 1970's (cf. taxa, although in many plant pathogenic genera DeBertoldi et al. 1973), which marked the the host was given a major consideration beginning of a new era in fungal research. The (Rossman and Palm-Hernández 2008; Hyde et last twenty years have seen an explosion in the al. 2010; Cai et al. 2011). Genera were use of molecular data in systematics and introduced because they were deemed taxonomy, to the extent where many journals sufficiently distinct from other groups of will no longer accept papers for publication species, and the differences typically amounted unless the taxonomic decisions are backed by to one to several characters. Groups of genera molecular data. DNA sequences are now used were combined into families and families into on a routine basis in fungal taxonomy at all orders and classes; the underlying reasoning levels. Molecular data in taxonomy and for grouping lineages was based on single to systematics are not devoid of problems, several characters deemed to be of particular however, and there are many concerns that are discriminatory value. However, the whole not always given the attention they deserve system hinged to no small degree on what (Taylor et al. 2000; Hibbett et al. 2011). We characters were chosen as arbiters of regularly meet research students who are inclusiveness. These characters often varied unsure about one or more steps in their nascent among mycologists and resulted in molecular projects. Much has been written disagreement and constant taxonomic about each step in the molecular pipeline, and rearrangements in many groups of fungi there are several good publications that should (Hibbett 2007; Shenoy et al. 2007; Yang 2011). be consulted regardless of the questions The limitations of morphology-based arising. What is missing, perhaps, is a wrapper taxonomy were recognized early on, and for these papers: a freely available, easy-to- numerous mycologists tried to incorporate read yet not overly long document that other sources of data into the classification summarizes all major steps and provides process, such as information on biochemistry, references for additional reading for each of production, metabolite profiles, them. This is an attempt at such an overview. physiological factors, growth rate, We have divided it into six sections that span pathogenicity, and mating tests (Guarro et al. the width of a typical molecular study of fungi: 1999; Taylor et al. 2000; Abang et al. 2009). 1) taxon sampling, 2) laboratory procedures, 3) Some of these attempts proved successful. For sequence quality control, 4) data analysis, 5) instance, the economically important plant the publication process, and 6) other pathogen Colletotrichum kahawae was observations. The target audience is research distinguished from C. gloeosporioides based students ("the users") about to undertake a on physiological and biochemical characters (Sanger sequencing-based) molecular (Correll et al. 2000; Hyde et al. 2009). mycological study with a systematic or Similarly, relative growth rates and the taxonomic focus. production of secondary metabolites on defined media under controlled conditions are valuable Sampling and compiling a dataset in studies of complex genera such as The dataset determines the results. If Aspergillus, Penicillium, and Colletotrichum there is anything that the user should spend (Frisvad et al. 2007; Samson and Varga 2007; valued time completing, it is to compile a rich, Cai et al. 2009). In many other situations, meaningful dataset of specimens/sequences however, these data sources proved (Zwickl and Hillis 2002). Right from the onset, inconclusive or included a signal that varied it is important to consider what hypotheses will over time or with environmental and be tested with the resulting phylogeny. If the experimental conditions. In addition, the hypothesis is that one taxon is separate from detection methods often did not meet the one or several other taxa, then all those taxa

2 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 should be sampled along with appropriate Sampling through herbaria, culture outgroups; consultation of taxonomic expertise collections, and the literature – Mining world (if available) is advisable. It is paramount to herbaria for species and specimens of any consider previous studies of both the taxa given genus is not as straightforward as one under scrutiny and closely related taxa to might think. Many herbaria (even in the achieve effective sampling. It is usually a Western countries) are not digitized, and no mistake to think that one knows about the centralized resource exists where all digitized relevant literature already, and the users are herbaria can be queried jointly. GBIF advised to familiarize themselves with how to (http://www.gbif.org/) nevertheless represents search the literature efficiently (cf. Conn et al. a first step in such a direction, and the user is 2003 and Miller et al. 2009). In particular, it advised to start there. Larger herbaria not often pays off to establish a set of core covered by GBIF may have their own publications and then look for other papers that databases searchable through web interfaces cite these core papers (through, e.g., “Cited by” but otherwise are best queried through an email in Google Scholar). If there is an opportunity to to the curator. Type specimens have a special sequence more than one specimen per taxon of standing in systematics (Hyde and Zhang 2008; interest, this is highly preferable and Ko Ko et al. 2011; McNeill et al. 2012), and particularly important when it comes to poorly the inclusion of type specimens in a study defined taxa and/or at low taxonomic levels. In lends extra weight and credibility to the results most cases it will not be enough to use and to unambiguous naming of generated whatever specimens the local herbarium or sequences. Herbaria may be unwilling to loan culture collection has to offer; rather the user type specimens (particularly for sequencing or should consider the resources available at the any other activity that involves destroying a national and international levels. Ordering part of the specimen, i.e., destructive sampling) specimens or cultures may prove expensive, and may offer to sequence them locally instead and the user may want to consider inviting – or may in fact already have done so. researchers with easy access to such specimens Specimens collected by particularly as co-authors of the study. The invited co- professional taxonomists or that are covered in authors may even oversee the local sequencing reference works should similarly be prioritized. of those specimens, to the point where more For cultures, we recommend the user to start at specimens does not have to mean more costs the CBS culture collection for the researcher. When considering (http://www.cbs.knaw.nl/collection/AboutColle specimens for sequencing, it should be kept in ctions.aspx) and the StrainInfo web portal mind that it may be difficult to obtain long, (http://www.straininfo.net/). Not all relevant high-quality DNA sequences from older specimens are however deposited in public specimens (cf. Larsson and Jacobsson 2004). collections; many taxonomists keep personal However, there are other factors than age, such herbaria or cultures. Scanning the literature for as how the specimen was dried and how it has relevant papers to locate those specimens is been stored, that also influence the quality of rewarding. It may be particularly worthwhile to the DNA. There also seem to be systematic try to include specimens reported from differences across taxa in how well their DNA outlying localities, in unusual ecological is preserved over time; as an example, species configurations, or together with previously adapted to tolerate desiccation are often found unreported interacting taxa with respect to to have better-preserved DNA than have short- those specimens already included. Such exotic lived mushrooms (personal observation). There specimens are likely to increase both the are different methods that increase the chances genetic depth and the discovery potential of the of getting satisfactory sequences even from study. Collections from distant locations and single cells or otherwise problematic material substrates other than that of the type material (see Möhlenhoff et al. 2001, Maetka et al. may represent different biological species. 2008, Bärlocher et al. 2010, and Särkinen et al. Although they cannot always be readily 2012). distinguished based on morphology (i.e., they

3 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 are cryptic species) including these taxa will do these searches and what to keep in mind allow for a better understanding of the targeted when interpreting BLAST results are given in species. Kang et al. (2010) and Nilsson et al. (2011, Incorrectly labelled specimens will be 2012). In short, the user must not be tempted to found even in the most renowned herbaria and include too many sequences from the BLAST culture collections. The fact that a leading searches. We advise the user to focus on expert in the field collected a specimen is only sequences that are very similar to, and that a partial safeguard against incorrect cover the full length of, the query sequences. If annotations, because processes other than the user follows this approach for all of their taxonomic competence – such as unintentional newly generated sequences, then they should label switching and culture contamination – not have to worry about picking up too distant contribute to misannotation. However most of sequences that would cause problems in the the herbaria and culture collections welcome alignment and analysis steps. Whether the suggestions and accurate annotations (with sequences obtained through the data mining reasonable verifications) for specimens that are step are annotated with Latin names or not – not identified correctly or are ambiguous. It is a after all, about 50% of the 300,000 public good idea to double check all specimens fungal ITS sequences are not identified to the retrieved and to seek to verify the taxonomic species level (cf. Abarenkov et al. 2010a) – affiliation of the specimens in the sequence does not really matter in our opinion. They analysis steps (sections 3 and 4). Keeping a represent samples of extant fungal biodiversity, digital image of the sequenced specimen is also and they carry information that may prove valuable. essential to disentangle the genus in question. Sampling through electronic resources The user should keep an eye on the – As more and more researchers in mycology geo/ecological/other metadata reported for the and the scientific community sequence fungi new entries to see if they expand on what was and environmental samples as part of their known before. work, the sequence databases accumulate A word of warning is needed on the significant fungal diversity. Even if the users reliability of public DNA sequences. As with know for certain that they are the only active herbaria and culture collections, many of these researchers working on some given taxon, they sequences carry incorrect species names can no longer rest assured that the databases do (Bidartondo et al. 2008) or may be the subject not contain sequences relevant to the of technical problems or anomalies. Sequences interpretation of that taxon. On the contrary, stemming from cloning-based studies of chances are high that they do. As an example, environmental samples are, in our experience, Ryberg et al. (2008) and Bonito et al. (2010) more likely to contain read errors, or to be both used insufficiently identified fungal ITS associated with other quality issues, than sequences in the public sequence databases to sequences obtained through direct sequencing make significant new taxonomic and of cultures and fruiting bodies. The process of geo/ecological discoveries for their target establishing basic quality and reliability of lineages. Therefore, it is recommended that fungal DNA sequences, including cloned ones, sequences similar to the newly generated is discussed further in section 3. sequences should be retrieved in order to Field sampling and collecting – Many establish phylogenetic relationships and verify fungi cannot be kept in culture and so can the accuracy of the sequence data through never be purchased from culture collections. comparison. Most herbaria are biased towards fungal groups This process is simple and amounts to studied at the local university or groups that are regular sequence similarity searches using noteworthy in other regards, such as BLAST (Altschul et al. 1997) in the economically important plant pathogens. The International Sequence Databases public sequence databases are perhaps less (INSD: GenBank, EMBL, and DDBJ; Karsch- skewed in that respect, but instead they Mizrachi et al. 2012) or emerencia manifest a striking geographical bias towards (http://www.emerencia.org) – details on how to Europe and North America (Ryberg et al.

4 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 2009). There are, in other words, limits to the fungal isolation as soon as possible after they fungal diversity one can obtain from resources are collected from the field, otherwise they can already available, such that collecting fungi in be used for direct DNA extraction when still the field often proves necessary. Indeed, fresh. In the case of leaf-inhabiting fungi, the collecting and recording metadata are plant material should be dried and compressed cornerstones of mycology, and it is essential, using standard methods, without any amidst all digital resources and emerging toxicogenic preservatives or treatments. For sequencing technologies, that we keep on plant-pathogenic and other culturable recording and characterizing the mycobiota microfungi, the practice of single-spore around us in this way (Korf 2005). Such isolation is desirable (Choi et al. 1999; voucher specimens and cultures form the basis Chomnunit et al. 2011). Indeed, of validation and re-determination for present monosporic/haploid cultures are recommended and future research efforts regardless of the whenever possible in molecular taxonomic approach adopted. In addition, some studies and several other contexts, such as morphological characters can only be observed sequencing. Heterokaryotic mycelia or or quantified properly in fresh specimens, other structures may manifest heterozygosity alluding further to the importance of field (i.e., co-occurring, divergent allelic variants) sampling. (Descriptions of such ephemeral for the targeted marker(s), which would add an characters should ideally be noted on the unwanted layer of complexity in many collection.) situations. Methods for single spore isolation, Fruiting bodies should be dried through desirable media for initial culturing, and airflow of ≤40 degrees centigrade or through preservation of cultures are equally important silica gel for tiny specimens. The dried factors to consider with respect to material should be stored in an airtight zip-lock improvements of the quality in molecular (mini-grip) plastic bag to prevent re-wetting experiments (see Voyron et al. 2009 and Abd- and access by insect pests. (Insufficiently dried Elsalam et al. 2010). material may be degraded by and Hibbett et al. (2011) made a puzzling moulds, leaving the DNA fragmented and observation on the limits of known fungal contaminated by secondary colonizers.) It is diversity: when fungi from herbaria are recommended to place a piece of the (fresh) sequenced and compared to fungi recovered fruiting body in a DNA preservation buffer from environmental samples (e.g., soil), these such a CTAB (hexadecyl-trimethyl-ammonium groups form two more or less disjoint entities. bromide; PubChem ID 5974) solution; ethanol By sequencing one of them, we would still not and particularly formalin-based solutions know much about the other. Porter et al. (2008) perform poorly when it comes to DNA similarly portrayed different scenarios for the preservation and subsequent extraction (cf. fungal community at a site in Ontario depen- Muñoz-Cadavid et al. 2010). In addition to ding on whether aboveground fruiting bodies fruiting bodies, it is recommended to store any or belowground soil samples were sequenced. vegetative or asexual propagules such as The fact that a non-trivial number of fungi do mycelial mats, mycorrhizae, rhizomorphs, and not seem to form (tangible) fruiting bodies has sclerotia that the user intends to employ for been known for a long time, but it is essential molecular identification. CTAB buffer works that field sampling protocols consider this. One fine for these too. If DNA extraction is planned idea could be to increase the proportion of in the immediate future, samples of collections somatic (non-sexual) fungal structures sampled can be placed into DNA extraction buffer during field trips (cf. Healy et al. 2013). Right already in the field. For fresh samples that have now that proportion may be close to zero, no soil particles, a modified Gitchier buffer based on the last few organized field trips we (0.8M Tris-HCl, 0.2 M (NH4)2SO4, 0.2% w/v have attended. Such collecting should be seen Tween-20) can be used for cell lysis and rapid as a long-term project unlikely to yield results DNA extraction (Rademaker et al. 1998). immediately, but we suggest it would be a With respect to plant-associated good idea if the users, when out collecting for microfungi, plant specimens should be used for some project, would try to collect, voucher,

5 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 and sequence at least one fungal structure they complex (Silva et al. 2012; Weir et al. 2012); would normally have ignored. GAPDH for the genera Bipolaris and Curvularia (Berbee et al. 1999; Manamgoda et Selection of gene/marker, primers, and al. 2012); MS204 and FG1093 for laboratory protocols Ophiognomonia (Walker et al. 2012); and tef- The protocols pertaining to the 1α for Diaporthe (Santos et al. 2010; Udayanga laboratory work come with numerous et al. 2012). For research questions above the decisions, many of which will fundamentally genus level, the user should normally turn to affect the end results. The best way to learn other genes than the ITS region. The nuclear laboratory work is through someone ribosomal large subunit (nLSU) has been a experienced, such that the users do not have to mainstay in fungal phylogenetic inference for make all those decisions on their own. Many more than twenty years – such that a large pitfalls and mistakes can be avoided in this selection of reference nLSU sequences are way. A sound step towards making informed available – and largely shares the ease of choices also involves looking into the amplification with the ITS region. It is literature: what genes/markers, primers, and challenged, and often surpassed, in information laboratory protocols were used by researchers content by genes such as β-tubulin (Thon and who studied the same or closely related taxa Royse 1999), tef-1α (O'Donnell et al. 2001), (with roughly similar research questions in MCM7 (Raja et al. 2011), RPB1 (Hirt et al. mind)? That said, many scientific studies come 1999), and RPB2 (Liu et al. 1999). As a heavily underspecified in the Materials & general observation, single-copy genes are Methods section, and the user should look to it typically more difficult to amplify from small for inspiration rather than for full recipes. After amounts of material and from moderate-quality selecting target marker(s) and appropriate DNA than are multi-copy genes/markers such primers, the user should be prepared to spend as nLSU and ITS (Robert et al. 2011; Schoch time in the molecular laboratory for one to et al. 2012). Genes known to occur as multiple several weeks. The laboratory processes copies (e.g., β-tubulin) in certain fungal involved in a typical molecular may produce misleading systematic study is DNA extraction, a PCR reaction to conclusions based on single-gene analysis amplify the gene(s)/marker(s), examination and (Hubka and Kolarik 2012). DNA sequences of purification of the PCR products, the coding genes can be translated into sequencing reactions, and the screening of the amino acid sequences in a process where each resulting fragments. nucleotide triplet (codon) corresponds to an Choice of gene/marker – It is primarily amino acid. This is advantageous in cases the research questions that dictate what genes where the degree of variability in the multiple or markers to target. For resolution at and nucleotide reaches below the generic level (including species homoplasious levels; the corresponding protein descriptions), the nuclear ribosomal ITS region sequence alignment will be more conserved, is a strong first candidate. The ITS region is and thus less noisy, due to the degenerate typically variable enough to distinguish among nature of the genetic code (e.g., the nucleotide species, and its multicopy nature in genomes triplets CAA and CAG code for the amino acid makes it easy to amplify even from older Q, glutamine). Multiple protein sequence herbarium specimens or in other situations of alignments are particularly useful to trace old low concentrations of DNA. The ITS region is relationships where the phylogenetic signal at the formal fungal barcode and the most the level has been partly eroded sequenced fungal marker (Begerow et al. 2010; over time. The user should be aware that the Schoch et al. 2012). For some groups of fungi, translation from nucleotide sequences to amino other genes or markers give better resolution at acids must be done in the correct reading frame the species level (see Kauserud et al. 2007 and (delimitation of codons) so as not to produce Gazis et al. 2011). As single molecular nonsense combinations of amino acids. markers, GAPDH and Apn2/MAT work well For larger phylogenetic pursuits it is for the Colletotrichum gloeosporioides common, even standard, to include more than

6 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 one unlinked gene/marker in phylogenetic literature is likely to hold clues to what endeavours (James et al. 2006). Phylogenetic lineages require more specialized primers. For species recognition by genealogical example, highly tailored primers are needed for concordance, typically relying on more than the ribosomal genes of the agaricomycete one gene genealogy, has become a tool of genera Cantharellus and Tulasnella modern day systematics of species complexes (Feibelman et al. 1994; Taylor and McCormick of fungi and other (Taylor et al. 2008). In the unlucky event that the standard 2000; Monaghan et al. 2009; Leavitt et al. primers do not work for the targeted by 2011). Indeed, many journals have come to the user – and nobody has developed expect that two or more genes be used in a specialized primers for that gene/marker and phylogenetic analysis, and it may be a good lineage combination already – the user may idea to use, e.g., a ribosomal gene such as have to design new primers based on sequences nLSU and a non-ribosomal gene in molecular in, e.g., INSD. Good software tools are studies. It should be kept in mind that all genes available for this (e.g., PRIMER3 at cannot be expected to work equally well in all http://frodo.wi.mit.edu/; Rozen and Skaletsky fungal lineages, both in terms of amplification 1999 and Primer-BLAST at success and information content. For species http://www.ncbi.nlm.nih.gov/tools/primer- descriptions and phylogenetic inferences of /) but as in data they require some 100+ lesser scope it is typically still deemed base-pairs of the regions immediately upstream acceptable – although perhaps not and downstream of the target region. If these recommendable – to use a single gene, and for upstream and downstream regions are not these purposes we advocate the ITS region as available in the sequence databases, the user the primary marker due to its high phylogenetic has to generate them themselves. Primers can informativeness, ease of amplification, role as also be designed manually based on a multiple the fungal barcode, and the large corpus of ITS sequence alignment (cf. Singh and Kumar sequences already available for comparison. 2001). Upon completing the in silico primer However, if the new species falls outside any design step, the user can turn to software tools known genus, we recommend sequencing the such as ecoPCR nLSU also to provide an approximate (www.grenoble.prabi.fr/trac/ecoPCR; Ficetola phylogenetic position for the new species (or as et al. 2010) to simulate the performance of the a type species of a new genus) within a family new primers on the target region under fairly or order. Knowledge of the rough phylogenetic realistic conditions. Chances are nevertheless position of the new sequence, coupled with high that the user does not need to design any literature, may give clues to what genes are new primers, particularly not if targeting any of likely to perform the best for species the more commonly used genes and markers in identification and subgeneric phylogenetic mycology. A very rich primer array is available inference of the new lineage. for the ribosomal genes (e.g., Gargas and Choice of primers – When amplifying DePriest 1996; Ihrmark et al. 2012; Porter and DNA from single specimens, one typically Golding 2012; Toju et al. 2012), and most does not have to worry about whether or not research groups have detailed primer sections the primer will match perfectly to the template. for both ribosomal and other genes and Even in case of imperfect match, the PCR markers on their home pages (e.g., surprisingly often still comes out successful. http://www.clarku.edu/faculty/dhibbett/protoco Therefore, it is a good idea to try standard ls.html, http://www.lutzonilab.net/primers/in- fungal or universal primers first – such as dex.shtml, and http://unite.ut.ee/primers.php). ITS1F (forward) and ITS4 (reverse) (White et Several publications are available with al. 1990; Gardes and Bruns 1993) in the case of suggestions for selecting primers for widely the ITS region – because chances are high that used molecular markers as well as recently they will work. It is however advisable to use available new markers for specific groups of at least one -specific primer (ITS1F in fungi which are commonly researched by the above) to reduce the chances of amplifying mycologists (Glass and Donaldson 1995; DNA of any co-occurring eukaryotes. The Carbone and Kohn 1999; Schmitt et al. 2009;

7 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 Santos et al. 2010; Walker et al. 2012). potential extraction/PCR inhibitors, such that Laboratory steps – The laboratory decreasing – rather than increasing – the process to obtain DNA sequences can roughly amount of starting material is often the first be divided into five steps: DNA extraction thing one should try in light of a failed DNA from the specimen, PCR, examination and extraction/PCR run (cf. Wilson 1997). When purification of the PCR products, sequencing slow-growing culturable fungi (e.g., some reactions, and fragment analysis/visualisation. bitunicate ascomycetes and marine fungi) are In vitro cloning may sometimes be needed to used in DNA extraction, the user should be pick out the correct PCR fragment for mindful of the substantial time needed for the sequencing, thus forming an additional step. growth of the fungus in order to get a sufficient The last few years have seen an increasing amount of tissue material for DNA extraction. trend of sending the purified PCR products to a One should also be aware of the risk of commercial or institutional sequencing facility contamination in the extraction (as well as for sequencing, leaving the user to oversee subsequent) steps, particularly when working only the three first of the above five steps. The with older herbarium specimens. A good rule is present authors, too, employ such external to never mix fresh and older fungal collections sequencing services, and our conclusion is that when extracting DNA and always to clean the it is cost- and time efficient and that the working area and the picking tools, such as the technical quality of the sequences produced is stereomicroscope and the forceps, thoroughly generally very high. before and in between each round of fungal Some researchers prefer the traditional material. The application of bleach is more CTAB-based way of DNA extraction. In terms efficient than, e.g., UV light and ethanol. It is of quality of results it is a good and cheap important that PCR products never be allowed choice (Schickmann et al. 2011). However, to enter the area where DNA extractions and others find it more convenient to use one of the PCR are performed. Avoidance of many commercially available kits for DNA contamination is particularly important when extraction. Under the assumption that the working with closely related species, as is starting material is relatively fresh and that the often done in taxonomic studies, since such taxa under scrutiny do not contain unusually contaminations are easily overlooked and may high amounts of compounds that affect the be tricky to identify afterwards. DNA extraction/PCR steps adversely, most For the PCR step, many commercial extraction kits are likely to perform satisfactory PCR kits are available on the market. One can at recovering enough DNA to support a PCR expect a standardized, consistent performance run (comparisons of extraction methods are from such kits; moreover they come with available in Fredricks et al. 2005, Karakousis et suggested PCR cycling programs under which al. 2006, and Rittenour et al. 2012). Substrates most target DNA will amplify in a satisfactory with high concentrations of humic acids, way. However, tweaking the PCR protocol notably soil and wood, are known to be may increase the yield significantly, and it is problematic in terms of extraction and well worth consulting with more experienced amplification (Sagova-Mareckova et al. 2008). colleagues and/or any relevant publication. Similarly, high concentrations of One particularly influential factor is the polysaccharides, nucleases, and pigments can annealing temperature of the primer, which is cause interference in extraction of DNA (and often provided with the primer sequence or the subsequent PCR) from some genera of may otherwise be estimated (in, e.g., microfungi (Specht et al. 1982; see also PRIMER3). It is also possible and often Samarakoon et al. 2013). For culturable fungi, beneficial to perform a gradient PCR to a minimal amount (10-20 mg) of actively determine the optimal annealing temperature. growing edges of cultures should be used. Various modifications and optimizations of the Rapid and efficient DNA extraction kits tend to PCR process are discussed in Hills et al. work the best when minimal amounts of tissue (1996), Cooper and Poinar (2000), Qiu et al. are used; the use of low amounts of starting (2001), and Kanagawa (2003). PCR amplicons material serves to reduce the amount of are normally verified for success by agarose

8 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1

Fig. 1 – A Safe-dye stained (SYBR® Safe DNA; 1% agarose gel, 80 V, 20 min) gel profiles showing a successful genomic DNA extraction of fungi. for visual estimation M) Ladders indicating the relative sizes of the DNA (100 bp marker). a Minimum amount of DNA for a PCR b Medium/sufficient amount of DNA c Excess DNA for PCR (quantification and dilution needed prior to the standard PCR reaction.) B A successful amplification of a PCR product stained with Safe dye (1% agarose gel, 80 V, 20 min) a–c a probable case of multiple copies in amplification due to non-specific binding of primers d–f successful amplicons with high concentrations of PCR products g negative control gel electrophoresis, with a gel stained with primers and unpaired ; many ethidium bromide (EtBr), where we can commercial kits are available for this purpose. observe the DNA by detecting the fluorescence These can range from single-step reactions to of the EtBr under UV light (EtBr is an multiple-step, highly effective procedures. One Intercalating Agent, which wedges itself into of the simplest, cheapest, and most widely used the grooves of DNA and stays there. More base approaches is a combination of exonuclease pairs result more grooves, which in turn means and Shrimp alkaline phosphatase enzymatic more EtBr can insert itself). Recently, treatment (Hanke and Wink 1994). Before alternative staining agents that are less toxic sending the purified PCR products for than ethidium bromide have been marketed as sequencing, they may need to be quantified for alternatives in safe handling (e.g., Goldview DNA content (depending on the sequencing (Geneshun Biotech.) and Safe DNA Dye/ facility). A rough quantification can be made SYBR® Safe DNA Gel). These markers use from the strength of the band during PCR UV light or other wavelengths for visualisation, possibly by comparing to visualisation, and any new laboratory should samples of standard concentrations. Special carefully consider which system to use for DNA quantifiers, usually relying on (fluoro-) detection of successful PCR products. It should spectrophotometry, can be used for more exact be remembered that these methods may detect quantification. The sequencing facilities will levels of DNA that, however low, may still usually perform the sequencing reactions suffice for successful sequencing, but it is themselves to get optimal, tailored usually a money and time saving effort to only performance on their sequencing machine. proceed with PCR products that produce a single, clearly visible band (Figure 1). Positive Sequence quality control and negative PCR controls should always be The responsibility to ensure that the employed. newly generated sequences are of high The (successful) PCR products should authenticity and reliability lies with the user. then be purified to remove, e.g., residual There are many examples in the literature

9 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 where compromised sequence data have lead to represent heterozygous sites that should be poor results and unjustified conclusions (cf. coded using the corresponding IUPAC codes Nilsson et al. 2006), suggesting that the quality (e.g., C/T = Y). In the case of multiple control step should not be taken lightly. Two heterozygous sites, it may be necessary with an points at which to exercise quality control is extra cloning step in the lab protocol to during sequence assembly and once the separate the different copies. When sequencing consensus sequence that has been produced PCR amplicons derived from dikaryotic or from the sequence assembly is ready. heterokaryotic tissue/mycelia, some sequence Sequence assembly – Many sequences contigs may shift from high quality to nonsense in systematics and taxonomy are generated due to the presence of an indel (insertion or with two primers – one forward and one deletion) in one of the alleles. In the case of reverse – such that the target sequence is only one indel present, one may obtain a usable effectively read twice. This dual coverage contig sequence by sequencing the fragment in brings about a mechanism for basic quality both directions. control of the read quality of the consensus A note on cloned sequences – When sequences. The primer reads returned from the performing Sanger sequencing of PCR sequencing machine should be assembled in a amplicons derived directly from fruiting body sequence assembly program into a contig, from tissue or mycelia, most PCR errors will which the final sequence is derived (Miller and normally not surface because the resulting Powell 1994). Although sequence assembly is chromatograms represent the averaged signal a semi-to-fully automated step in programs from numerous original templates. However, such as Sequencher (http://genecodes.com/), when cloning, a single PCR fragment is picked, Geneious (http://www.geneious.com/), and multiplied, and sequenced. This means that any Staden (http://staden.sourceforge.net/), the polymerase-generated errors will become results must be viewed as tentative and need visible and have to be controlled for. The only verification. The user should inspect each reasonable guard against such PCR generated contig for positions (bases) of substandard errors is cloning and sequencing replicate appearance. During the sequencing process, the fragments from the same PCR reaction. A sequencing machine quantifies the light unique mutation appearing in only one of the intensity of the four terminal (dyed) sequences most likely represents a PCR- nucleotides at each position, and the relative generated error and should be omitted from the intensity is represented as chromatogram resulting consensus sequence. If some curves in each primer read (Figure 2a). mutations approach a 50/50 ratio in the Occasionally the assembly software struggles replicate sequences, they most likely represent to reconcile the chromatograms from the two allelic variants and should be analyzed reads, leaving the bases incorrectly determined, separately in the further phylogenetic analyses. undecided (as IUPAC DNA ambiguity symbols Although more expensive, the implementation such as “N” and “S”, see Cornish-Bowden of high-fidelity polymerase with high 1985), or in the wrong order. The user should accuracy is especially important when scan the contig and unpaired sequences along sequencing cloned fragments. their full length for such anomalies, most of Quality control of DNA sequences – To which can be identified through the odd exercise quality control through the sequence appearance of the chromatogram curves in assembly step, while very important, is only a those positions (Figure 2b). The distal (5’ and part of the quality management process. There 3’) ends of contigs are nearly always of poor are many kinds of sequence errors and pitfalls quality, and the user should expect to have to that cannot be addressed during sequence trim these in all contigs. There may also be assembly. Nilsson et al. (2012) listed a set of ambiguities in the chromatogram if there are guidelines on how to establish basic different copies of the DNA region in the authenticity and reliability for newly generated sequenced material. This may show as twin (or, for that matter, downloaded) fungal ITS curves for different bases at the same site sequences. Five relatively common sequence (Figure 2c). In most cases such twin curves problems were addressed: whether the

10 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1

a

b

c

d

Fig. 2 – a Clear, high-quality chromatogram curves. The software interprets curves like these with ease. b Correct base-calling is hard, both for the software and for the user, when the chromatogram curves look like this. In most cases it is better to re-sequence the specimen than to try to salvage data from such chromatograms. c If there is more than one (non-identical) copy of the marker amplified, the chromatogram curves tend to look like this. In the middle of the image, the uppermost primer read has produced a “TCAA” whereas the second primer produced “TTGA”. The reads appear clear and unequivocal, such that poor read quality is not a likely explanation for the discrepancy. d Base-calling tends to be hard in homopolymer-rich regions, and one often finds that regions after the homopolymer-rich segments are less well read. All screenshots generated from Sequencher® version 4 sequence analysis software, Gene Codes Corporation, Ann Arbor, MI USA (http://www.genecodes.com).

11 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 sequence represents the intended gene/marker, something of a research field in its own right. whether the sequence is given in the correct A phylogenetic analysis should not involve a orientation, whether the sequence is chimeric, few clicks on the mouse to produce a tree; whether the sequence manifests tell-tale signs rather it is a process that involves many of other technical anomalies, and whether the decisions and that should be given significant sequence represents the intended taxon. In thought. Many software packages in the field short, the user should never assume any newly of phylogenetic inference are complex and generated sequences to be of satisfactory command-line driven – intimidating, perhaps, quality; rather, the user should take measures to some biologists. But instead of resorting to to ensure the basic reliability of the sequences. simpler click-to-run programs, the user should Such measures do not have to be complex, carefully read the respective documentation time-consuming, or computationally expensive. and query the literature for the difference Nilsson et al. (2012) computed a joint multiple between the approaches and options. It is also sequence alignment of their entire query ITS worth asking for expert help or even inviting sequence dataset and located the conserved pertinent researchers as co-authors to the 5.8S gene of the ITS region in all sequences of project. That is how important it is to get the the alignment. That approach verified that all analysis part right! Suboptimal methodological sequences were ITS sequences and that they choices beget less – or incorrectly – resolved were given in the correct orientation. Each phylogenies and in the end may lead to query sequence was then subjected to a erroneous conclusions. This is in nobody’s BLAST search in INSD, and by examining the interest. graphical BLAST summary as well as the full In some cases – particularly near the BLAST output, the authors were able to rule species level – a may not be out the presence of bad chimeras and the most suitable model for representing the sequences with severe technical problems. relations among the taxa in the query dataset Finally, for all query sequences with some sort (due to, e.g., intralocus recombination, of taxonomic annotation (e.g., “Penicillium hybridization, and introgression). In these sp.”), the authors examined the most cases, a network (Kloepper and Huson 2008) significant BLAST results for clues that the may be a better analytical solution. Networks name was at least approximately right; if a can be used to visualize data conflicts in tree sequence annotated as Penicillium would not reconstruction (split networks, e.g., Bryant and produce hits to other sequences annotated as Moulton 2004; Huson and Bryant 2006; Huson Penicillium (accounting for synonyms and et al. 2011) or explicitly represent phylogenetic anamorph-teleomorph relationships), then relationships (reticulate networks, e.g., Huber something would almost certainly be wrong. It et al. 2006; Kubatko 2009; Jones et al. 2013). should be kept in mind that the public sequence The user should also be aware of an ongoing databases contain a non-trivial number of paradigm shift in evolutionary biology, where compromised sequences, suggesting that the single-gene trees and concatenation approaches guidelines – or other means of quality control – are replaced by species tree thinking (Edwards should be applied also to all sequences 2009). Although the distinction between gene downloaded from such resources. The and species trees is not new (Pamilo and Nei guidelines suggested do not form a 100% 1988; Doyle 1992), most species phylogenies guarantee for high-quality DNA sequences, but published to date are based on single gene trees they are likely to result in a more robust and or trees based on concatenation of two or more reliable dataset. alignments from different linkage groups. Recently, however, models and methods have Alignment and phylogenetic analysis been developed to account for the fact that Systematics and tree-based thinking go gene trees can differ in topology and branch hand in hand, and we urge the reader to employ lengths due to population genetics processes a phylogenetic approach even when describing that are best modelled with the multispecies a single new species. It should be stressed, coalescent (Rannala and Yang 2003). In such a though, that phylogenetic inference is framework, gene trees representing unlinked

12 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 parts of the cellular genomes themselves known as they may be. We recommend any of represent data for the species trees (Liu and MAFFT (Katoh and Toh 2010), Muscle (Edgar Pearl 2007). Recent advances in sequencing 2004), and PRANK (Löytynoja and Goldman technologies and whole-genome sequence 2005) for large or otherwise non-trivial information have greatly facilitated the sequence datasets. Unless the number of sampling of multiple genes (Ebersberger et al. sequences reaches several hundreds, these 2012). Interestingly, species trees also enable programs can all be run in their most advanced objective species delimitations based on mode (e.g., “linsi” in the case of MAFFT) on a molecular data (O’Meara 2010; Fujita et al. regular desktop computer. The product is a 2012). Species trees can be inferred directly multiple sequence alignment file, usually in the from sequence data using, e.g., *BEAST FASTA format (Pearson and Lipman 1988). It (Heled and Drummond 2010). is however important to recognize that manual At this stage, the user should facilitate inspection of alignment files is always needed downstream analyses by giving all of the new and manual adjustment is often warranted sequences short, unique names that feature (Figure 3a). This involves loading the only letters, digits, and underscore. It is good alignment in an alignment viewer such as practice to start the name with a unique SeaView (Gouy et al. 2010) and trying to find identifier of the sequence/specimen (since and correct for instances where the multiple some programs truncate the names of the sequence alignment program seems to have sequences down to some pre-defined length); performed suboptimally (Figure 3b,c). Exact this may then be followed by a more guidelines on how to do this are hard to give descriptive name, such as a Latin binomial. A (cf. Gonnet et al. 2000; Lassmann and good sequence name is Sonnhammer 2005), and the user is advised to SM11c_Amanita_gemmata; in contrast, the sit down with someone experienced with this default names of sequences downloaded from process to have it demonstrated. Alternatively, INSD (and that feature, e.g., pipe characters the user could send the alignment for and whitespace) may cause problems in many improvement to someone experienced and then software tools relevant to alignment and contrast the two versions. Occasionally, phylogenetic inference. alignments feature sections that defy all Multiple sequence alignment – Upon attempts at reconstruction of a meaningful completing the quality management of the alignment (Figure 3d). Such sections should be newly generated sequences, the user hopefully kept in the alignment and may be excluded has a well-founded idea of what taxa to use as from the subsequent phylogenetic analysis. outgroups. The choice of is of There are several software solutions that substantial importance and should be done as attempt to formalize this procedure (e.g., thoroughly as possible (cf. de la Torre-Bárcena Talavera and Castresana 2007; Liu et al. 2009; et al. 2009). Ideally, and based on the Capella-Gutierrez et al. 2009). It should be literature, at least three progressively more noted, though, that there may be distantly related taxa (with respect to the unambiguously aligned subsets of sequences ingroup) should be chosen as tentative (taxa) in such globally unalignable regions, and outgroups, although it is the alignment step that these sub-alignments may contain useful decides what sequences appear suitable as information. outgroups and what sequences appear too Phylogenetic analysis – There are many distant. One regularly sees scientific different approaches to phylogenetic analysis, publications that employ too distant outgroups; ranging from distance-based through this translates into alignment problems and parsimony to maximum likelihood and potentially compromised inferences of Bayesian inference (cf. Hillis et al. 1996, phylogeny. Felsenstein 2004, and Yang and Rannala There are many software tools available 2012). For highly coherent, low-homoplasy for multiple sequence alignment, but we urge datasets, most approaches are likely to produce the user to go for a recent (and readily updated) similar results. That said, there is no single, one rather than relying on old programs, well- optimal choice of analysis approach for all

13 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 datasets, and it is a good idea for the user to Standard model testing programs such as start exploring their data using different jModelTest (Posada 2008) and ProtTest methods. A fast, up-to-date program for (Darriba et al. 2011; for protein sequences) can parsimony analysis is TNT (Goloboff et al. be used to select models of molecular 2008). If the users were to browse through for each partition if the correct partitioning various recent phylogeny-oriented publications scheme is known in advance. However, the in high-profile journals, they would probably best partitioning scheme can be difficult to come to the conclusion that Bayesian inference know a priori because it depends on a large and (particularly for large datasets) maximum range of factors. In these cases, partitioning likelihood, both of which are explicitly model- schemes and models of based (parametric), have a lot of momentum at can be co-estimated using PartitionFinder present. For one thing, it is widely accepted (Lanfear et al. 2012). Gaps are normally treated that they are better able to correct for spurious as missing data by phylogenetic inference effects caused by long-branch attraction, which programs; if the user feels that some gap- is a problem of major concern in phylogenetic related characters are particularly inference (Bergsten 2005). Indeed, some phylogenetically informative (and especially if journals are likely to require an analysis using the sequences under scrutiny are very similar one of these two methods. There are many such that informative characters are thinly different programs for phylogenetic analysis; a seeded), the gap-related information can be good list is found at added as a recoded, separate partition (see for http://evolution.genetics.washington.edu/phylip example Simmons and Ochotorena 2000). /software.html. Among the more popular tools Some approaches try to reconstruct the for Bayesian analysis are MrBayes (Ronquist insertion/deletion and substitution processes et al. 2012), BEAST (Drummond et al. 2012), simultaneously (Suchard and Redelings 2006; and PhyloBayes (Lartillot et al. 2009); for Varón et al. 2010). maximum likelihood, RAxML (Stamatakis As datasets grow beyond some 20 2006), GARLI (Zwickl 2006), PhyML sequences, there are no analytical solutions for (Guindon et al. 2009), and FastTree (Price et finding (guaranteeing) the best phylogenetic al. 2010) see heavy use. Many of these tree for any optimality criterion. Instead, phylogenetic analysis programs are heuristic searches – that is, searches that do not furthermore available as web services on guarantee that the optimal solution is found – portals such as CIPRES are employed to infer the tree or a distribution (http://www.phylo.org/portal2/) and BioPortal of trees. The most popular tools for Bayesian (http://www.bioportal.uio.no/), which inference of phylogeny rely on Markov Chain overcomes problems with limited Monte Carlo (MCMC) sampling, a heuristic computational capacity of most personal sampling procedure where one to many computers. When using explicitly model-based searches (“chains”) traverse the space of approaches, it is important to consider how the possible trees (typically fully parameterized molecular evolution should be modelled, i.e., with, e.g., branch lengths) one step at the time, how the changes in the current sequence data comparing the next tree (step ahead) only to have evolved. The above model-based the tree it presently holds in memory (Ronquist programs allow, to various degrees, for and Huelsenbeck 2003). There are several implementation of separate models for statistics that can be calculated through different partitions (typically: each software utilities such as Tracer and AWTY to gene/marker, or each codon position in protein test if the chain has reached a steady state, i.e., coding genes) of the dataset. For example, in if the chain has converged to a set of stable the case of the ITS region, the user should be results for which significant improvements do aware that the two spacer regions (ITS1 and not seem possible (Rambaut and Drummond ITS2) and the intercalary 5.8S gene usually 2007; Nylander et al. 2008). The set of trees differ dramatically in substitution rate, so it (and parameters) inferred is then used to may be appropriate to use different models of compute a majority-rule consensus tree with nucleotide evolution for each of these. branch lengths and support values. At a more

14 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1

Figs 3a–c – a A satisfactory multiple sequence alignment run through MAFFT and then edited manually. The alignment is a subset of the 55-taxon alignment of Ghobad-Nejhad et al. (2010). b Manual editing of a multiple sequence alignment in SeaView. The original alignment is given to the left, with the modified version to the right. c Manual editing of a multiple sequence alignment in SeaView. The original alignment is given on top, with the modified version at the bottom with minimal gaps and ambiguities. 15 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 general level, there are methods to evaluate the al. 2004). reliability of tree topology and individual Interpretation of phylogenetic results branches for all major approaches to and trees can be surprisingly tricky (Hillis et al. phylogenetic inference. Branch support should 1996; Felsenstein 2004). The first thing the routinely be estimated – regardless of approach user should check is that the sequence used as – and reported on in the subsequent an outgroup really is an appropriate outgroup publication. The most common measures of with respect to the ingroup sequences. The branch support are Bayesian posterior answer tends to come naturally when several probabilities (BPP; applicable in Bayesian progressively more distant sequences with inference) and, in respect to the ingroup are included in the distance/parsimony/likelihood-based alignment. Sequences of low read quality or of inferences, non-parametric methods a chimeric nature tend to be found on such as traditional bootstrap (Felsenstein 1985) unusually long branches or as isolated sister and jackknife (Farris et al. 1996). Bayesian taxa to larger , and a second look at such posterior probabilities are estimated from the sequences is always warranted (cf. Berney et proportion of trees exhibiting the in al. 2004). Branches that do not receive question from the posterior distributions significant, but rather modest, support can generated by the MCMC simulation and usually be thought of as non-existent such that represent the probability that the corresponding what they really depict is the state of “no clades are true conditional on the model and resolution available”; to draw far-reaching the data. Bootstrap/jackknife values are conclusions for such modestly supported clades obtained from iterated resampling/dropping of is wishful thinking, that is, something the user characters from the multiple sequence should stay clear of. What constitutes alignment and re-running the phylogenetic “significant” branch support is a non-trivial analysis for each new alignment; the question, though, and the user is advised to bootstrap/jackknife values then represent the focus on clades that appear strongly supported proportion of times the corresponding clades (e.g., more than 80% bootstrap/jackknife or were recovered from these perturbed more than 0.95 BPP). In the context of clades alignments. and branches, it is tempting to identify some We argue that branch lengths should taxa as “” to others, but nearly all always be indicated in phylogenetic trees. In phylogenetic uses of the word “basal” are the context of parsimony, branch lengths conceptually flawed (Krell and Cranston represent the minimum number of mutations 2004), and the user is best off avoiding it (steps) separating two nodes, whereas in altogether. The “sister clade/taxon” construct is maximum likelihood/Bayesian inference the most straightforward alternative. branch lengths are normally given in the unit of Presenting phylogenetic results – The expected changes per site. Any extreme branch end product of a phylogenetic analysis is lengths observed for any of the taxa should be typically a tree in the (text-based) Newick explored for mistakes in the alignment, (http://en.wikipedia.org/wiki/Newick_format) sequences of poor quality, inclusion of taxa or Nexus (Maddison et al. 1997) formats. This that are not closely related to the other taxa, file can be loaded into tree viewing programs and increased evolutionary rate along that such as FigTree branch. Such long-branched taxa call for a re- (http://tree.bio.ed.ac.uk/software/figtree/), evaluation of the alignment and possibly also manipulated, and saved in a graphics format the taxon sampling; long branches represent (preferably a vector-based format such as .svg, one of the most difficult problems in .emf, or .eps). This file, in turn, can be loaded phylogenetic inference (Bergsten 2005). If the into, e.g., Corel Draw or Adobe Illustrator user combines two or more genes from the (GIMP and InkScape are free alternatives) for same linkage group (e.g., the mitochondrial further processing, e.g., cleaning up the taxon genome) in the alignment, it is customary to names and highlighting focal clades (Figure 4). test the dataset for conflicts prior to There is nothing wrong with presenting a undertaking the phylogenetic analysis (Hipp et phylogenetic tree in a straightforward, non-

16 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1

Fig. 3d – A portion of a multiple sequence alignment that defies scientifically meaningful global alignment. The first half of the alignment looks satisfactory, but the quality rapidly deteriorates midway through the alignment. Trying to shoehorn these sequence data into some sort of joint aligned form is sure to produce artificial, non-biological results. Such highly variable regions should be kept in the alignment for reference but excluded from the subsequent phylogenetic analysis. If and when the sequences are exported from the alignment, the user should make sure to employ the “Include all characters” option before exporting to avoid exporting only the parts used in the phylogenetic analysis. While the right half of the alignment is not fit for joint alignment covering all of the query sequences, it is clearly not without signal at a lower level, i.e., for subsets of the sequences. Tools such as trimAL (Capella-Gutierrez et al. 2009) can be used to identify divergent alignment sections. embellished style, particularly not for trees Most studies employing phylogenetic with a limited number of sequences. Many analysis are heavily underspecified in the researchers nevertheless prefer to take their Materials & Methods section and, worse, do trees to the next level by, e.g., mapping not provide neither the multiple sequence morphological characters onto clades, alignment nor the phylogenetic trees derived as indicating generic boundaries with colours, and files (cf. Leebens-Mack et al. 2006). In our collapsing large clades into symbolic units. opinion, a good study specifies details on the iTOL (Letunic and Bork 2011), Mesquite multiple sequence alignment (e.g., number of (Maddison and Maddison 2011), and sites in total, constant sites, and parsimony OneZoom (Rosindell and Harmon 2012) are informative sites); on all relevant/non-default powerful tools for such purposes. The Deep settings of the phylogeny program employed; Hypha issue of Mycologia (December 2006) or and on the phylogenetic trees produced. All the iTOL site (http://itol.embl.de/) may serve software packages should be cited with version as sources of inspiration on how trees could be number. The user should always bundle the manipulated and processed to facilitate multiple sequence alignment and the tree(s) interpretation and highlight take-home with the article through TreeBase (http:// messages. www.treebase.org; Sanderson et al. 1994),

17 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1

Fig. 4 – Maximum likelihood phylogenetic reconstruction of the Tuberaceae phylogeny based on ITS, nLSU, EF1α, and RPB2. ML bootstrap values and Bayesian posterior probabilities are given above and below the branches, respectively. Branches in bold indicate ML bootstrap support >70 and posterior probabilities of 1.0. Further symbols represent reconstructed ancestral host plant associations; spore ornamentation; type specimens; and species of particular economic importance. The phylogeny is rooted with taxa from the Helvellaceae. Figure from Bonito et al. (2013) and used with kind permission.

18 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 DRYAD (http://datadryad.org/; Greenberg minded peers to read the early draft and to 2009), or even as online supplementary items instruct these on what kind of feedback is to the article (as applicable). This makes wanted (ranging from, say, “overall subsequent data access easy for the scientific impression” to “very detailed”). Several community (cf. Mesirov 2010) and helps dispel increasingly detailed rounds of drafts and the old assertion of taxonomy as a secretive, feedback are normally needed. Three good esoteric discipline. ways to increase the chances of having the Submission of data to INSD – As part manuscript accepted in the end is to have at of the publication process, nearly all journals least two (external if possible) colleagues read require that newly generated DNA sequence through the final draft (submission candidate) data be deposited in INSD. Several submission well ahead of submission; to make sure that the options are available to the user (see language used is impeccable; and to follow the http://www.ncbi.nlm.nih.gov/genbank/submit/ instructions of the target journal down to the and very pixel. Many journals are flooded with http://www.ebi.ac.uk/ena/about/embl_bank_su submissions and are only too happy to reject bmissions for details). Submission to INSD manuscripts if they deviate ever so slightly tends to be a time-consuming step that should from the formally correct configuration. A few not be saved to the last minute. The users general considerations follow below. should make it a habit to annotate their Choice of target journal – The user sequences as richly as possible, in spite of the should decide upon the primary target journal fact that it is tempting to provide as little before writing the first word of the manuscript. information as possible just to finish the The second- and third-choice journals should submission procedure quickly. ideally be chosen to be close in scope and style with respect to the primary one, so that the user The publication process would not have to spend significant time The scientific publication is something restructuring or refocusing the manuscript if of the common unit of qualification in the the primary journal rejects it. The scope of the natural sciences and the end product of many journal dictates how the manuscript should be scientific projects. Having spent considerable written: if it is a more general (even non- time with the data collection and analysis, the mycological) journal, the user should probably user may be tempted to rush through the focus on the more general, widely relevant writing phase just to get the paper out. This is aspects of the results. General journals have the usually a mistake, because publishing tends to advantage of reaching a broader audience than be more difficult, and to require more of the taxonomy-oriented journals, and if the users authors, than one perhaps would think. Indeed, feel they have the data to potentially warrant it is not uncommon for a project to take two or such a choice then they should certainly try. more years from conception to its final, However, trying to shoehorn smaller published state. Even very experienced taxonomic papers into more general journals is researchers struggle with the writing phase, and likely to prove a futile exercise. There is we advise the user to start writing as early as nothing wrong with publishing in more possible. To compile a first complete draft – restricted journals, and to be able to tell the however bad – of the manuscript is usually the difference between a manuscript with the key to wrapping up a paper in a reasonable potential for a more general journal and a time. Some researchers prefer to start the manuscript that probably should be sent to a writing process from a mind map or a bulleted more restricted journal is a skill that is likely to list of things to cover in the text; others prefer save the user considerable time and energy. to simply start writing (Torrence et al. 1994). The user should also be prepared to be Already the second or third draft is probably rejected: this is a part of being in science and good enough to hand out to one or more peers not something that should be taken too /co-authors for discussion. Feedback is crucial personally. That said, if rejected, the user (Caffarella and Barnett 2000); said authors should scrutinize the rejection letter for clues to recommend choosing reasonably kind or like- how the manuscript could be improved. The

19 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 user should make it a habit to try to implement version of the manuscript submitted to the at least the easiest, and preferably several journal, i.e., “pre” and “post” review) available more, of those suggested changes in the as a Word or PDF file on their personal manuscript before submitting it to the next homepages or certain public repositories such journal in line. as arXiv (http://arxiv.org/). The Many funding agencies and SHERPA/RoMEO database at institutional rankings ascribe extra weight to http://www.sherpa.ac.uk/romeo/ has the full publications in journals that are indexed in the details on what the major publishing ISI Thompson Web of Science companies/journals allow the authors to do (http://thomsonreuters.com/), i.e., publications with their pre- and post-print manuscripts. To in journals that have, or are about to get, a make a manuscript publicly available in pre- or formal impact factor postprint form, in turn, qualifies as “open (http://en.wikipedia.org/wiki/Impact_factor). access” as far as many funding agencies are Citations are often quantified in an analogous concerned. Thus, if the user needs to publish manner. This makes it a good idea to seek to open access but cannot afford it, this may be target journals with a formal impact factor (or the way to go. at least journals with an explicit ambition to There are some data to suggest that obtain one), more or less irrespectively of what open access papers may be cited more often that impact factor is. We are under the than non-open access ones (MacCallum and impression that impact factors are falling out of Parthasarathy 2006), although the first favour as a bibliometric measurement unit of integrative comparison based on mycological scientific quality – which would perhaps papers has yet to be undertaken. Most reduce the incentive for seeking to maximize scientists, we imagine, are attracted to the ideas the impact factor in all situations – but if the of openness, distributed web archiving, and of user has a choice, it may make sense to strive the dissemination of their results also to those for journals that have an impact factor of 1 or who do not have access to subscription-based higher. If given the choice between a journal journals. Not all is gold that glitters, however. run by a society and a journal run by a An increasing number of open access commercial publishing company – with publishers may not have the authors’ best otherwise approximately equal scope and interest in mind but are rather run as strictly impact – we would go for the one maintained commercial enterprises, often without direct by the society. A recent overview of journals participation of scientists. A Google search on with a full or partial focus on mycology is “grey zone open access publishers” will provided by Hyde and KoKo (2011). produce lists of journals (and publishers) that Open access – An increasing number of the user may want to stay clear of. The papers funding agencies require that papers resulting are open access, but the peer review procedure from projects which they have funded be tends to be less than stringent, and the journals published through an open access model, i.e., seem to take little, if any, action to promote the freely downloadable (cf. http://www.doaj.org/). results of the authors. Many of the journals are As a consequence, the number of open access very poorly covered in literature databases and journals has exploded during the last few years. are unlikely to ever qualify for formal impact Similarly, most non-open access journals with factors. It would seem probable that such subscription fees now offer an “open choice” publications would detract from, rather than alternative where individual articles are made add to, ones CV. open access online. In both cases, there is normally a fee involved, and the fee tends to be Other observations sizable (e.g., US$1350 for PLoS ONE, Taxonomy is sometimes referred to as a US$1990 for the BMC series, and US$3000 for discipline in crisis (Agnarsson and Kuntner many Elsevier articles as of October 2012). 2007; Drew 2011). There is some truth to such Less well known is perhaps that most major claims, because the number of active publishing companies allow the authors to taxonomists is in constant decline. Taxonomy make pre- or post-prints (the first and the last furthermore struggles to obtain funding in

20 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 competition with disciplines deemed more phylogenies produced by taxonomic cutting-edge. However, genome-scale sequence researchers. data is already accessible for non-model There is nevertheless a disconnect organisms at a modest cost, and it can be between NGS-powered environmental anticipated that the standard procedures to sequencing and taxonomy. Most environmental obtain the molecular data described in this sequencing efforts recover a substantial paper will in part be complemented and even number of operational taxonomic units (Blaxter replaced by high-throughput sequencing et al. 2005) that cannot be identified to the techniques at similar costs or less at some point level of species, genus, or even order (cf. in the not-too-distant future. This will require Hibbett et al. 2011). However, since NGS substantial bioinformatics efforts and data sequences are not readily archived in INSD storage capabilities, but it also opens enormous (but rather stored in a form not open to direct possibilities for new scientific discoveries as query in the European Nucleotide Archive; well as more advanced and formalized models Amid et al. 2012), these new (or at least to trace phylogenies and the speciation process. previously unsequenced) lineages are not Here we discuss some of the challenges faced immediately available for BLAST searches and by taxonomy in light of the project pursued by thus generally ignored. Similarly there is no our imaginary user. device or centralized resource to database the Sanger versus next-generation fungal communities recovered in the many sequencing – The last seven years have seen NGS-powered published studies, and most dramatic improvements of, and additions to, opportunities to compare and correlate taxa and the assortment of DNA sequencing communities across time and space are simply technologies. Collectively referred to as next- missed. Partial software solutions are being generation sequencing (NGS), these new worked upon in the UNITE database methods can produce millions – even billions – (Abarenkov et al. 2010a,b) and elsewhere, but of sequences in a few days. At the time of we have no solution at present. It would be writing this article, most NGS technologies on beneficial if all studies of eukaryote/fungal the market produce sequences of shorter length communities were to involve at least one than those obtained through traditional Sanger fungal systematician/taxonomist. Conversely, sequencing, but this too is likely to change in fungal systematicians/taxonomists should seek the near future. The user should keep in mind, training in bioinformatics and biodiversity though, that the NGS techniques and Sanger informatics to become prepared to deal with sequencing are used for different purposes. the research questions involving taxonomy, NGS methods are primarily used to sequence systematics, ecology, and evolution in an genomes/RNA transcripts and to explore increasingly connected, molecular world. environmental substrates such as soil and the How can we do systematics and human gut for diversity and functional taxonomy in a better way? – An important part processes. To date there is no NGS technique of environmental sequencing studies is to give to fully replace Sanger sequencing for regular accurate names to the sequences (Lindahl et al. research questions in systematics and in press). By recollecting species, designating taxonomy (neither in terms of read quality nor epitypes, depositing type sequences in focus on single specimens), so it cannot be GenBank, and publishing phylogenetic claimed that the present user employs studies/classifications of fungi, taxonomists “obsolete” sequencing technology. On the provide a valuable service for present and contrary, we feel the user should be future scientific studies. For example, it was congratulated for doing phylogenetic previously impossible to accurately name taxonomy, for even the most cutting-edge NGS common endophytic isolates of Colletotrichum, efforts require names and a classification Phyllosticta, Diaporthe, or Pestalotiopsis system from systematic and taxonomic studies through BLAST searches, as the chances were in the form of, e.g., lists of species and higher very high that the names tagged to sequences groups recovered in their samples. Underlying in INSD for these genera were highly those lists are reliable reference sequences and problematic (see KoKo et al. 2011). However,

21 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 now that inclusive phylogenetic trees have warrant co-authorship. been published for these genera (Glienke et al. On a related note, taxonomists should 2011; Maharachchikumbura et al. 2011; take the time to revisit their sequences in INSD Cannon et al. 2012; Udayanga et al. 2012) to make sure their annotations and metadata are using types and epitypes, it is possible to name up-to-date and as complete as possible. At some of the endophytic isolates in these genera present this does not seem to be the case; with reasonable accuracy. Some of the modern Tedersoo et al. (2011) showed, for instance, monographs with molecular data and that a modest 43% of the public fungal ITS compilations of different genera and families sequences were annotated with the country of (e.g., Lombard et al. 2010; Bensch et al. 2010; origin. We believe most researchers would Hirooka et al. 2012) have become excellent agree that the purpose of sequence databases is guides for aspiring researchers with the to be a meaningful resource to science. The knowledge on multiple taxonomic disciplines. databases will however never be a truly The relevance of taxonomy will therefore be meaningful resource to science if all data upgraded as long as care is taken to provide the contributors keep postponing the updates of sequence databases with accurately named, and their entries to a day when free time to do those richly annotated, sequences. updates, magically, becomes available. Taxonomists should also include Keeping ones public sequences up-to-date must aspects of, e.g., ecology and geography in be made a prioritized undertaking; taxonomy taxonomic papers and seek to publish them should be a discipline that shares its progress where they are seen by others. We feel that a with the rest of the world in a timely fashion. If good taxonomic study should draw from and when generating ITS sequences, fungal molecular data (as applicable) and that it taxonomists should try to meet the formal should attempt to relate the newly generated barcoding criteria sequence data to the data already available in (http://barcoding.si.edu/pdf/dwg_data_standard the public sequence databases and in the s-final.pdf; Schoch et al. 2012). This lends literature. When interesting patterns and extra credibility to their work and is, in turn, connections are found, the modern taxonomist likely to lead to a wider dissemination of their should pursue them, including writing to the results. Taxonomists should similarly take the authors of those sequences to see if there is in opportunity to add further value to their studies fact an extra dimension to their data. As fellow by including additional illustrations and other taxonomists, we should all strive to help each data in their publications. Space normally other in maximizing the output of any given comes at a premium in scientific publications, dataset. Taxonomists have a reputation of but most journals would be glad to bundle working alone – or at best in small, closed highlights of additional interesting biological groups – and judging by the last few years’ properties and additional figures of, e.g., worth of taxonomic publications, that fruiting bodies, habitats, and collection sites as reputation is at least partially true. It would do supplementary, online-only items (cf. Seifert taxonomy well if this practise were to stop. We and Rossman 2010). envisage even arch-taxonomical papers such as Promoting ones results and career descriptions of new species as opportunities to advancement – Even if the authors’ new involve and integrate, e.g., ecologists and/or scientific paper is excellent, the chances are bioinformaticians into taxonomic projects. that it will not stand out above the background There are only benefits to inviting (motivated) noise to get the attention it deserves – at least researchers to ones studies: their focal aspects not at a speed the author deems rapid enough. (such as ecology) will be better handled in the The author may therefore want to do some PR manuscript compared to if the author had for their article. One step in that direction is to handled them on their own, and the new co- cite their paper whenever appropriate. Another authors are furthermore likely to improve the step is to send polite, non-invasive emails to general quality of the manuscript by bringing researchers that the author feels should know an outside perspective and experience. We feel about the article – as long as the author that perhaps 1-2 days of work is sufficient to provides a URL to the article rather than

22 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 attaching a sizable PDF file, nobody is likely to one’s own, using nothing but the literature as take significant offence from such emails if guidance, is a recipe for failure. To find people sent in moderation. A third step would be to committed to helping you may be difficult, present the results at the next relevant however, and we advocate that anyone who conference. In fact, the authors could send a contributes 1-2 days or more of their time poster and article reprints to conferences they towards your goals should be invited to are not able to attend through helpful co- collaborate in the study. Newcomers to the authors or colleagues. On a more general level, molecular field have the opportunity to learn conferences form the perfect arena where to from the experience – and past mistakes – of meet people and forge connections. We others. In this way they will get most things sincerely hope that all Ph.D. students in correct right from the start, and in this paper mycology will have the opportunity to attend at we have highlighted what we feel are best least 2-3 conferences (international ones if at practises in the field. We believe that all possible). Poster sessions and “Ph.D. taxonomists should adhere to best practises and mixers” are particularly relaxed events where to maximise the scientific potential and general friends and future collaborators can be usefulness of their data, because taxonomy is a established. Even highly distinguished, discipline that is central to biology and that prominent mycologists tend to be open to faces multiple far-reaching, but also promising, questions and discussion, and many of them challenges. It is however impossible to give take particular care to talk to Ph.D. students advice that would hold true in all situations and and other emerging talents. We acknowledge, throughout time, and we hope that the reader however, that not everyone is in a position to will use our material as a realization rather than go to conferences. Fortunately we live in a as a binding recipe. digital world where friendly, polite emails can get you a long way. Indeed, several of the Acknowledgements present authors have made their most valuable RHN acknowledges financial support scientific connections through email only. from the Swedish Research Council of Similarly, to sign up in, e.g., Google Scholar to Environment, Agricultural Sciences, and receive email alerts when ones papers are cited Spatial Planning (FORMAS, 215-2011-498) is a good way to scout for researchers with and from the Carl Stenholm Foundation. DU similar research interests. We believe that the and DSM thank Mushroom Research user should make an effort to connect to others, Foundation, Thailand for postgraduate because it is though others that opportunities scholarship. The North European Forest present themselves. Furthermore, if one Mycologists network is acknowledged for chooses those “others” with some care, the ride infrastructural support. Bengt Oxelman is will be a whole lot more enjoyable. supported by a grant (2012-3719) from the Swedish Research Council. Manfred Binder, Concluding remarks Terri Porter, Bo Jarneving, and Rob Lanfear The present publication seeks to lower are acknowledged for valuable discussions and the learning threshold for using molecular comments. Support from the Gothenburg methods in fungal systematics. Each header Bioinformatics Network is gratefully deserves its own review paper, but we have acknowledged. instead provided a topic overview and the references needed to commence molecular References studies. We do not wish to oversimplify molecular mycology; at the same time, we do Abang MM, Abraham WR, Asiedu R, not wish to depict it as an overly complex Hoffmann P, Wolf G, Winter S. 2009 – undertaking, because in most cases it is not. Secondary metabolite profile and Our take-home message is that the best way to phytotoxic activity of genetically learn molecular methods and know-how is distinct forms of Colletotrichum through someone who already knows them gloeosporioides from yam (Dioscorea well. To attempt to get everything working on spp.). Mycological Research 113, 130–

23 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 140. 964–977. Abarenkov K, Nilsson RH, Larsson KH et al. Bergsten J. 2005 – A review of long-branch 2010a – The UNITE database for attraction. 21, 163–193. molecular identification of fungi – Berney C, Fahrni J, Pawlowski J. 2004 – How recent updates and future perspectives. many novel eukaryotic 'kingdoms'? New Phytologist 186, 281–285. Pitfalls and limitations of Abarenkov K, Tedersoo L, Nilsson RH et al. environmental DNA surveys. BMC 2010b – PlutoF – a web-based Biology 2, 13. workbench for ecological and Bidartondo M, Bruns TD, Blackwell M et al. taxonomical research, with an online 2008 – Preserving accuracy in implementation for fungal ITS GenBank. Science 319, 1616. sequences. Evolutionary Bioinformatics Blaxter M, Mann J, Chapman T, Thomas F, 6, 189–196. Whitton C, Floyd R, Abebe E. 2005 – Abd-Elsalam KA, Yassin MA, Moslem MA, Defining operational taxonomic units Bahkali AH, de Wit PJGM, McKenzie using DNA barcode data. Philosophical EHC, Stephenson SL, Cai L, Hyde KD. Transactions of the Royal Society of 2010 – Culture collections, the new London Series B – Biological Sciences herbaria for fungal pathogens. Fungal 360, 1935–1943. Diversity 45, 21–32. Bonito GM, Gryganskyi AP, Trappe JM, Agnarsson I, Kuntner M. 2007 – Taxonomy in Vilgalys R. 2010 – A global meta- a changing world: seeking solutions for analysis of Tuber ITS rDNA sequences, a science in crisis. Systematic Biology species diversity, host associations and 56, 531–539. long-distance dispersal. Molecular Altschul SF, Madden TL, Schäffer AA, Zhang Ecology 19, 4994–5008. J, Zhang Z, Miller W, Lipman DJ. 1997 Bonito G, Smith ME, Nowak M, Healy RA, – Gapped BLAST and PSI-BLAST: a Guevara G et al. 2013 – Historical new generation of protein database and diversification of search programs. Nucleic Acids truffles in the Tuberaceae and their Research 25, 3389–3402. newly identified Southern Hemisphere Amid C, Birney E, Bower L et al. 2012 – sister lineage. PLoS ONE 8, e52765. Major submissions tool developments Bryant D, Moulton V. 2004 – NeighborNet, an at the European nucleotide archive. agglomerative algorithm for the Nucleic Acids Research 40(D1), D43– construction of planar phylogenetic D47. networks. Molecular Biology and Begerow D, Nilsson RH, Unterseher M, Maier Evolution 21, 255–265. W. 2010 – Current state and Bärlocher F, Charette N, Letourneau A, perspectives of fungal DNA barcoding Nikolcheva LG, Sridhar KR. 2010 – and rapid identification procedures. Sequencing DNA extracted from single Applied Microbiology and conidia of aquatic hyphomycetes. Biotechnology 87, 99–108. Fungal Ecology 3, 115–121. Bensch K, Groenewald JZ, Dijksterhuis J et al. Caffarella RS, Barnett BG. 2000 – Teaching 2010 – Species and ecological diversity doctoral students to become scholarly within the Cladosporium cladospo- writers, the importance of giving and rioides complex (Davidiellaceae, receiving criticism. Studies in Higher Capnodiales). Studies in Mycology 67, Education 25, 39–52. 1–96. Cai L, Hyde KD, Taylor PWJ et al. 2009 – A Berbee ML, Pirseyedi M, Hubbard S. 1999 – polyphasic approach for studying Cochliobolus phylogenetics and the Colletotrichum. Fungal Diversity 39, origin of known, highly virulent 183–204. pathogens, inferred from ITS and Cai L, Udayanga D, Manamgoda DS, glyceraldehyde-3-phosphate dehydro- Maharachchikumbura SSN, Liu XZ, genase gene sequences. Mycologia 91, Hyde KD. 2011 – The need to carry out

24 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 re-inventory of tropical plant Bioinformatics 27, 1164–1165. pathogens. Tropical Plant Pathology 36, DeBertoldi M, Lepidi AA, Nuti MP. 1973 – 205–213. Significance of DNA base composition Cannon PF, Damm U, Johnston PR, Weir BS. in classification of Humicola and 2012 – Colletotrichum – current status related genera. Transactions of the and future directions. Studies in British Mycological Society 60, 77–85. Mycology 73, 181–213. Doyle JJ. 1992 – Gene trees and species trees, Capella-Gutierrez S, Silla-Martinez JM, molecular systematics as one-character Gabaldon T. 2009 – trimAl, a tool for taxonomy. Systematic 17, 144– automated alignment trimming in large- 163. scale phylogenetic analyses. Drew LW. 2011 – Are we losing the science of Bioinformatics 25, 1972–1973. taxonomy? BioScience 61, 942–946. Carbone I, Kohn L. 1999 – A method for Drummond AJ, Suchard MA, Xie D, Rambaut designing primer sets for speciation A. 2012 – Bayesian phylogenetics with studies in filamentous ascomycetes. BEAUti and the BEAST 1.7. Molecular Mycologia 91, 553–556. Biology and Evolution 29, 1969–1973. Choi YW, Hyde KD, Ho WH. 1999 – Single Ebersberger I, de Matos Simoes R, Kupczok A, spore isolation of fungi. Fungal Gube M, Kothe E, Voigt K, von Diversity 3, 29–38. Haeseler A. 2012 – A consistent Chomnunti P, Schoch CL, Aguirre-Hudson B, phylogenetic backbone for fungi. Ko-Ko TW, Hongsanan S, Jones EBG, Molecular Biology and Evolution 29, Kodsueb R, Phookamsak R, 1319–1334. Chukeatirote E, Bahkali AH, Hyde KD. Edgar RC. 2004 – MUSCLE, multiple 2011 – Capnodiaceae. Fungal Diversity sequence alignment with high accuracy 51, 103–134. and high throughput. Nucleic Acids Conn VS, Isaramalai S, Rath S, Jantarakupt P, Research 32, 1792–1797. Wadhawan R, Dash Y. 2003 – Beyond Edwards SV. 2009 – Is a new and general MEDLINE for literature searches. theory of molecular systematics Journal of Nursing Scholarship 35, emerging? Evolution 63, 1–19. 177–182. Farris JS, Albert VA, Källersjö M, Lipscomb Cooper A, Poinar HN. 2000 – Ancient DNA, D, Kluge AG. 1996 – Parsimony do it right or not at all. Science 289, jackknifing outperforms neighbor- 1139. joining. Cladistics 12, 99–124. Cornish-Bowden A. 1985 – Nomenclature for Feibelman T, Bayman P, Cibula WG. 1994 – incompletely specified bases in nucleic Length variation in the internal acid sequences, recommendations 1984. transcribed spacer of ribosomal DNA in Nucleic Acids Research 13, 3021– chanterelles. Mycological Research 98, 3030. 614–618. Correll JC, Guerber JC, Wasilwa LA, Sherrill Felsenstein J. 1985 – Confidence limits on JF, Morelock TE. 2000 – Inter- and phylogenies – an approach using the Intraspecific variation in bootstrap. Evolution 39, 783–791. Colletotrichum and mechanisms which Felsenstein J. 2004 – Inferring phylogenies. affect population structure In, Sinauer Press. Sunderland, Mass. Colletotrichum, host specificity, Ficetola GF, Coissac E, Zundel S, Riaz T, pathology, and host pathogen Shehzad W, Bessière J, Taberlet P, interaction (eds. D. Prusky, S.Fungal Pompanon F. 2010 – An in silico Diversity 201 Freeman and M.B. approach for the evaluation of DNA Dickman). APS Press, St. Paul, barcodes. BMC Genomics 11, 434. Minnesota, USA. Fredricks DN, Smith C, Meier A. 2005 – Darriba D, Taboada GL, Doallo R, Posada D. Comparison of six DNA extraction 2011 – ProtTest 3, fast selection of methods for recovery of fungal DNA as best-fit models of protein evolution. assessed by quantitative PCR. Journal

25 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 of Clinical Microbiology 43, 5122– TNT, a free program for phylogenetic 5128. analysis. Cladistics 24, 774–786. Frisvad, JC, Larsen TO, de Vries RP, Meijer Gonnet GH, Korostensky C, Benner S. 2000 – M, Houbraken J, Samson RA, Javier Evaluation measures of multiple Cabañes F, Ehrlich K. 2007 –. sequence alignments. Journal of Secondary metabolite profiling, growth 7, 261–276. profiles and other tools for species Gouy M, Guindon S, Gascuel O. 2010 – recognition and important Aspergillus SeaView Version 4, A multiplatform mycotoxins. Studies in Mycology 59, graphical user interface for sequence 31–37. alignment and phylogenetic tree Frisvad JC, Andersen B, Thrane U. 2008 – The building. Molecular Biology and use of secondary metabolite profiling in Evolution 27, 221–224. of filamentous fungi. Greenberg J. 2009 – Theoretical considerations Mycological Research 112, 231–240. of lifecycle modeling, an analysis of the Fujita MK, Leaché AD, Burbrink FT, McGuire Dryad repository demonstrating JA, Moritz C. 2012 – Coalescent-based automatic metadata propagation, species delimitation in an integrative inheritance, and value system adoption. taxonomy. Trends in Ecology & Cataloging and Classification Quarterly Evolution 27, 480–488. 47, 380–402. Gardes M, Bruns TD. 1993 – ITS primers with Guarro J, Gene J, Stchigel AM. 1999 – enhanced specificity for basidiomycetes Developments in fungal taxonomy. – application to the identification of Clinical Microbiology Reviews 12, mycorrhizas and rusts. Molecular 454–500. Ecology 2, 113–118. Guindon S, Dufayard JF, Gascuel O. 2009 – Gargas A, DePriest PT. 1996 – A nomenclature Estimating maximum likelihood for fungal PCR primers with examples phylogenies with PhyML. from intron-containing SSU rDNA. Bioinformatics for DNA Sequence Mycologia 88, 745–748. Analysis, Methods in Molecular Gazis R, Rhener S, Chaverri P. 2011 – Species Biology 537, 113–137. delimitation in fungal endophyte Hanke M, Wink M. 1994 – Direct DNA diversity studies and its implications in sequencing of PCR-amplified vector ecological and biogeographic inserts following enzymatic degradation inferences. Molecular Ecology 30, of primer and dNTPs. Biotechniques 3001–3013. 17, 858–860. Ghobad-Nejhad M, Nilsson RH, Hallenberg N. Healy RA, Smith ME, Bonito GM et al. 2013 – 2010 – Phylogeny and taxonomy of the High diversity and widespread genus Vuilleminia (Basidiomycota) occurrence of mitotic spore mats in based on molecular and morphological ectomycorrhizal Pezizales. Molecular evidence, with new insights into the Ecology 22, 1717–1732. Corticiales. TAXON 59, 1519–1534. Heled J, Drummond AJ. 2010 – Bayesian Glass NL, Donaldson GC. 1995 – inference of species trees from Development of primer sets designed multilocus data. Molecular Biology and for use with the PCR to amplify Evolution 27, 570–580. conserved genes from filamentous Hibbett DS. 2007 – After the gold rush, or ascomycetes. Applied and Environ- before the flood? Evolutionary mental Microbiology 61, 1323–1330. morphology of mushroom-forming Glienke C, Pereira OL, Stringari D et al. 2011– fungi () in the early 21st Endophytic and pathogenic Phyllosticta century. Mycological Research 111, species, with reference to those 1001–1018. associated with Citrus Black Spot. Hibbett DS, Ohman A, Glotzer D, Nuhn M, Persoonia 26, 47–56. Kirk P, Nilsson RH. 2011 – Progress in Goloboff PA, Farris JS, Nixon KC. 2008 – molecular and morphological taxon

26 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 discovery in Fungi and options for Hyde KD, Zhang Y. 2008 – Epitypification, formal classification of environmental should we epitypify? Journal of sequences. Fungal Biology Reviews 25, Zhejiang University 9, 842–846. 38–47. Hyde KD, Cai L, McKenzie EHC, Yang YL, Hillis DM, Moritz C, Mable BK. 1996 – Zhang JZ, Prihastuti H. 2009 – Molecular Systematics, 2nd Ed., Colletotrichum, a catalogue of Sinauer Associates, Sunderland, MA. confusion. Fungal Diversity 39, 1–17. Hipp LA, Hall JC, Sytsma KJ. 2004 – Hyde KD, McKenzie EHC, KoKo TW. 2010 – Congruence versus phylogenetic Anamorphic fungi in a natural accuracy, revisiting the incongruence classification – checklist and notes for length difference test. Systematic 2010. Mycosphere 2, 1–88. Biology 53, 81–89. Hyde KD, KoKo TW. 2011 – Where to publish Hirooka Y, Rossman AY, Samuels GJ, Lechat in mycology? Current Research in C, Chaverri P. 2012 – A monograph of Environmental and Applied Mycology Allantonectria, Nectria, and 1, 27–52. Pleonectria (Nectriaceae, Hypocreales, Ihrmark K, Bödeker ITM, Cruz-Martinez K et ) and their pycnidial, al. 2012 – New primers to amplify the sporodochial, and synnematous fungal ITS2 region – evaluation by anamorphs. Studies in Mycology 71, 1– 454-sequencing of artificial and natural 210. communities. FEMS Microbiology Hirt RP Logsdon JM, Healy Jr. B, Dorey MW, Ecology 82, 666–677. Doolittle WF, Embley TM. 1999 – James TY, Kauff F, Schoch CL et al. 2006 – Microsporidia are related to fungi, Reconstructing the early evolution of evidence from the largest subunit of Fungi using a six-gene phylogeny. RNA polymerase II and other . Nature 443, 818–822. Proceedings of the National Academy Jones G, Sagitov S, Oxelman B. 2013 – of Sciences USA 96, 580–585. Statistical inference of allopolyploid Horn WS, Simmonds MSJ, Schwartz RE, species networks in the presence of Blaney WM. 1996 – Variation in incomplete lineage sorting. Systematic production of phomodiol and Biology, doi:10.1093/sysbio/syt012. phomopsolide B by Phomopsis spp. Kanagawa T. 2003 – Bias and artifacts in Mycologia 88, 588–595 multitemplate polymerase chain Huber KT, Oxelman B, Lott M, Moulton V. reactions (PCR). Journal of Bioscience 2006 – Reconstructing the evolutionary and Bioengineering 96, 317–323. history of polyploids from multi- Kang S, Mansfield MAM, Park B et al. 2010 – labeled trees. Molecular Biology and The promise and pitfalls of sequence- Evolution 23, 1784–1791. based identification of plant pathogenic Hubka V, Kolarik M. 2012 – β-tubulin fungi and oomycetes. Phytopathology paralogue tubC is frequently 100, 732–737. misidentified as the benA gene in Karakousis A, Tan L, Ellis D, Alexiou H, Aspergillus section Nigri taxonomy, Wormald PJ. 2006 – An assessment of primer specificity testing and the efficiency of fungal DNA extraction taxonomic consequences. Persoonia 29, methods for maximizing the detection 1–10. of medically important fungi using Huson DH, Bryant D. 2006 – Application of PCR. Journal of Microbiological phylogenetic networks in evolutionary Methods 65, 38–48. studies. Molecular Biology and Karsch-Mizrachi I, Nakamura Y, Cochrane G. Evolution 23, 254–267. 2012 – The International Nucleotide Huson DH, Rupp R, Scornavacca C. 2011 – Collaboration. Phylogenetic networks, concepts, Nucleic Acids Research 40(D1), D33– algorithms and applications. Cambridge D37. University Press, UK. Katoh K, Toh H. 2010 – Parallelization of the

27 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 MAFFT multiple sequence alignment Leavitt SD, Johnson LA, Goward T, St Clair program. Bioinformatics 26, 1899– LL. 2011 – Species delimitation in 1900. taxonomically difficult lichen-forming Kauserud H, Shalchian-Tabrizi K, Decock C. fungi, an example from 2007 – Multi-locus sequencing reveals morphologically and chemically diverse multiple geographically structured Xanthoparmelia (Parmeliaceae) in lineages of Coniophora arida and C. North America. Molecular olivacea (Boletales) in North America. Phylogenetics and Evolution 60, 317– Mycologia 99, 705–713. 332. Kloepper TH, Huson DH. 2008 – Drawing Leebens-Mack J, Vision T, Brenner E et al. explicit phylogenetic networks and 2006 – Taking the first steps towards a their integration into SplitsTree. BMC standard for reporting on phylogenies, Evolutionary Biology 8, 22. Minimum Information about a Ko Ko TW, Stephenson SL, Bahkali AH, Hyde Phylogenetic Analysis (MIAPA). The KD. 2011 – From morphology to OMICS Journal 10, 231–237. molecular biology, can we use Letunic I, Bork P. 2011 – Interactive Tree Of sequence data to identify fungal Life v2, online annotation and display endophytes? Fungal Diversity 50, 113– of phylogenetic trees made easy. 120. Nucleic Acids Research 39(S2),W475– Korf RP. 2005 – Reinventing taxonomy, a W478. curmudgeon's view of 250 years of Lindahl BD, Nilsson RH, Tedersoo L et al. fungal taxonomy, the crisis in 2013 – Fungal community analysis by biodiversity, and the pitfalls of the high-throughput sequencing of phylogenetic age. Mycotaxon 93, 407– amplified markers – a user’s guide. 415. New Phytologist, in press. DOI, Krell F-T, Cranston PS. 2004 – Which side of 10.1111/nph.12243 the tree is more basal? Systematic Liu L, Pearl D. 2007 – Species trees from gene Entomology 29, 279–281. trees, reconstructing Bayesian posterior Kubatko LS. 2009 – Identifying hybridization distributions of a species phylogeny events in the presence of coalescence using estimated gene tree distributions. via model selection. Systematic Systematic Biology 56, 504–514. Biology 58, 478–488. Liu K, Raghavan S, Nelesen S, Linder CR, Lanfear R, Calcott B, Ho SYW, Guindon S. Warnow T. 2009 – Rapid and accurate 2012 – PartitionFinder, combined large-scale coestimation of sequence selection of partitioning schemes and alignments and phylogenetic trees. substitution models for phylogenetic Science 324, 1561–1564. analyses. Molecular Biology and Liu Y, Whelen JS, Hall BD. 1999 – Evolution 29, 1695–1701. Phylogenetic relationships among Larsson E, Jacobsson S. 2004 – Controversy ascomycetes, evidence from an RNA over Hygrophorus cossus settled using polymerase II subunit. Molecular ITS sequence data from 200 year-old Biology and Evolution 16, 1799–1808. type material. Mycological Research Lombard L, Crous PW, Wingfield BD, 108, 781–786. Wingfield MJ. 2010 – Systematics of Lartillot N, Lepage T, Blanquart S. 2009 – Calonectria, a genus of root, shoot and PhyloBayes 3, a Bayesian software foliar pathogens. Studies in Mycology package for phylogenetic reconstruction 66, 1–71. and molecular dating. Bioinformatics Löytynoja A, Goldman N. 2005 – An 25, 2286–2288. algorithm for progressive multiple Lassmann T, Sonnhammer ELL. 2005 – alignment of sequences with insertions. Automatic assessment of alignment Proceeedings of the National Academy quality. Nucleic Acids Research 33, of Sciences USA 102, 10557–10562. 7120–7128. MacCallum CJ, Parthasarathy H. 2006 – Open

28 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 access increases citation rate. PLoS 2010 – Improving molecular detection Biology 4, 176. of fungal DNA in formalin-fixed Maddison WP, Maddison DR. 2011 – paraffin-embedded tissues, comparison Mesquite, a modular system for of five tissue DNA extraction methods evolutionary analysis. Version 2.75. using panfungal PCR. Journal of http,//mesquiteproject.org Clinical Microbiology 48, 2471–2153. Maddison DR, Swofford DL, Maddison WP. Möhlenhoff P, Müller L, Gorbushina AA, 1997 – NEXUS, an extensible file Petersen K. 2001 – Molecular approach format for systematic information. to the characterisation of fungal Systematic Biology 46, 590–621. communities, methods for DNA Maeta K, Ochi T, Tokimoto K et al. 2008 – extraction, PCR amplification and Rapid species identification of cooked DGGE analysis of painted art objects. poisonous mushrooms by using Real- FEMS Microbiology Letters 195, 169– Time PCR. Applied and Environmental 173. Microbiology 74, 3306–3309. Nilsson RH, Ryberg M, Kristiansson E, Maharachchikumbura SSN, Guo LD, Abarenkov K, Larsson K-H, Kõljalg U. Chukeatirote E, Bhhkali AH, Hyde KD. 2006 – Taxonomic reliability of DNA 2011 – Pestalotiopsis – morphology, sequences in public sequence databases, phylogeny, biochemistry and diversity. a fungal perspective. PLoS ONE 1, e59. Fungal Diversity 50, 167–187. Nilsson RH, Ryberg M, Sjökvist E, Abarenkov Manamgoda DS, Cai L, McKenzie EHC et al. K. 2011 – Rethinking taxon sampling in 2012 – A phylogenetic and taxonomic the light of environmental sequencing. re-evaluation of the Bipolaris – Cladistics 27, 197–203. Cochliobolus – Curvularia complex. Nilsson RH, Tedersoo L, Abarenkov K et al. Fungal Diversity 56, 131–144. 2012 – Five simple guidelines for McNeill J, Barrie FR, Buck WR et al., eds. establishing basic authenticity and 2012 – International Code of reliability of newly generated fungal Nomenclature for algae, fungi, and ITS sequences. MycoKeys 4, 37–63. plants (Melbourne Code), Adopted by Nylander JAA, Wilgenbusch JC, Warren DL, the Eighteenth International Botanical Swofford DL. 2008 – AWTY (are we Congress Melbourne, Australia, July there yet?), a system for graphical 2011 (electronic ed.), Bratislava, exploration of MCMC convergence in International Association for Plant Bayesian phylogenetics. Bioinformatics Taxonomy. 24, 581–583. Mesirov JP. 2010 – Accessible reproducible O'Donnell K, Lutzoni FM, Ward TJ, Benny research. Science 327, 415–416. GL. 2001 – Evolutionary relationships Miller CW, Chabot MD, Messina TC. 2009 – among mucoralean fungi (Zygomycota), A student’s guide to searching the evidence for family on a literature using online databases. large scale. Mycologia 93, 286–296. American Journal of Physics 77, 1112– O'Meara BC. 2010 – New heuristic methods 1117. for joint species delimitation and Miller MJ, Powell JI. 1994 – A quantitative species tree inference. Systematic comparison of DNA sequence assembly Biology 59, 59–73. programs. Journal of Computational Pamilo P, Nei M. 1988 – Relationships Biology 1, 257–269. between gene trees and species trees. Monaghan MT, Wild R, Elliot M et al. 2009 – Molecular Biology and Evolution 5, Accelerated species inventory on 568–583. Madagascar using coalescent-based Pearson WR, Lipman DJ. 1988 – Improved models of species delineation. tools for biological sequence Systematic Biology 58, 298–311. comparison. Proceedings of the Muñoz-Cadavid C, Rudd S, Zaki SR, Patel M, National Academy of Sciences USA Moser SA, Brandt ME, Gómez BL. 85, 2444–2448.

29 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 Porter TM, Skillman JE, Moncalvo J-M. 2008 methodologies used for assessing – Fruiting body and soil rDNA fungal diversity via ITS sequencing. sampling detects complementary Journal of Environmental Monitoring assemblage of Agaricomycotina 14, 766–774. (Basidiomycota, Fungi) in a hemlock- Robert V, Szöke S, Eberhardt U, Cardinali G, dominated forest plot in southern Meyer W, Seifert KA, Levesque CA, Ontario. Molecular Ecology 17, 3037– Lewis CT. 2011 – The quest for a 3050. general and reliable fungal DNA Porter TM, Golding GB. 2012 – Factors that barcode. The Open Applied Informatics affect large subunit ribosomal DNA Journal 5, 45–61. amplicon sequencing studies of fungal Ronquist F, Huelsenbeck JP. 2003 – MrBayes communities, classification method, 3, Bayesian phylogenetic inference primer choice, and error. PLoS ONE under mixed models. Bioinformatics 7,e35749. 19, 1572–1574. Posada D. 2008 – jModelTest, phylogenetic Ronquist F, Teslenko M, van der Mark P et al. model averaging. Molecular Biology 2012 – MrBayes 3.2: efficient Bayesian and Evolution 25, 1253–1256. phylogenetic inference and model Price MN, Dehal PS, Arkin AP. 2010 – choice across a large model space. FastTree 2 – Approximately maximum- Systematic Biology 61, 539–542. likelihood trees for large alignments. Rosindell J, Harmon LJ. 2012 – OneZoom, A PLoS ONE 5, e9490. fractal explorer for the Tree of Life. Qiu X, Wu L, Huang H, McDonel PE, PLoS Biology 10, e1001406. Palumbo AV, Tiedje JM, Zhou J. 2001 Rossman AY, Palm-Hernández ME. 2008 – – Evaluation of PCR-generated Systematics of plant pathogenic fungi, chimeras, mutations, and heterodu- why it matters. Plant Disease 92, 1376– plexes with 16S rRNA gene-based 1386. cloning. Applied and Environmental Rozen S, Skaletsky H. 1999 – Primer3 on the Microbiology 67, 880–887. WWW for general users and for Rademaker JLW, Louws FJ, de Bruijn FJ. biologist programmers. Methods in 1998 – Characterization of the diversity Molecular Biology 132, 365–386. of ecologically important microbes by Ryberg M, Nilsson RH, Kristiansson E, Topel rep-PCR genomic fingerprinting, p. 1- M, Jacobsson S, Larsson E. 2008 – 27. In Akkermans ADL, van Elsas JD, Mining metadata from unidentified ITS and de Bruijn FJ (ed.), Molecular sequences in GenBank, a case study in microbial ecology manual, v3.4.3. Inocybe (Basidiomycota). BMC Kluwer Academic Publishers, Evolutionary Biology 8, 50. Dordrecht, The Netherlands. Ryberg M, Kristiansson E, Sjökvist E, Nilsson Raja H, Schoch CL, Hustad V, Shearer C, RH. 2009 – An outlook on the fungal Miller A. 2011 – Testing the internal transcribed spacer sequences in phylogenetic utility of MCM7 in the GenBank and the introduction of a Ascomycota. MycoKeys 1, 63–94. web-based tool for the exploration of Rambaut A, Drummond AJ. 2007 – Tracer fungal diversity. New Phytologist 181, v1.4, Available from 471–477. http,//beast.bio.ed.ac.uk/Tracer Sagova-Mareckova M, Cermak L, Novotna J, Rannala B, Yang Z. 2003 – Bayes estimation Plhackova K, Forstova J, Kopecky J. of species divergence times and 2008 – Innovative methods for soil ancestral population sizes using DNA DNA purification tested in soils with sequences from multiple loci. Genetics widely differing characteristics. 164, 1645–1656. Applied and Environmental Rittenour WR, Park J-H, Cox-Ganser JM, Microbiology 74, 2902–2907. Beezhold DH, Green BJ. 2012 – Samarakoon T, Wang SY, Alford MH. 2013 – Comparison of DNA extraction Enhancing PCR amplification of DNA

30 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 from recalcitrant plant specimens using phylogenetic analyses. Systematic a trehalose-based additive. Applications Biology 49, 369–381. in Plant Sciences 1, 1200236. Singh VK, Kuamar A. 2001 – PCR primer Samson RA, Varga J. 2007 – Aspergillus design. Molecular Biology Today 2, systematics in the genomic era. Studies 27–32. in Mycology 59, 1–206. Specht CA, Dirusso CC, Novotny CP, Ullrich Sanderson MJ, Donoghue MJ, Piel W, RC. 1982 – A method for extracting Eriksson T. 1994 – TreeBASE, a high-molecular-weight deoxyribo- prototype database of phylogenetic nucleic acid from fungi. Analytical analyses and an interactive tool for Biochemistry 119, 158–163. browsing the phylogeny of life. Stamatakis A. 2006 – RAxML-VI-HPC, American Journal of Botany 81, 183. Maximum likelihood-based Santos JM, Correia VG, Phillips AJL. 2010 – phylogenetic analyses with thousands Primers for mating-type diagnosis in of taxa and mixed models. Diaporthe and Phomopsis, their use in Bioinformatics 22, 2688–2690. teleomorph induction in vitro and Suchard MA, Redelings BD. 2006 – BAli-Phy, biological species definition. Fungal simultaneous Bayesian inference of Biology 114, 255–270. alignment and phylogeny. Schickmann S, Kräutler K, Kohl G, Nopp- Bioinformatics 22, 2047–2048. Mayr U, Krisai-Greilhuber I, Swofford DL. 2002 – PAUP 4.0b10, Hackländer K, Urban A. 2011 – phylogenetic analysis using parsimony. Comparison of extraction methods Sinauer Associates, Sunderland applicable to fungal spores in faecal Särkinen T, Staats M, Richardson JE, Cowan samples from small mammals. Sydowia RS, Bakker FT. 2012 – How to open 63, 237–247. the treasure chest? Optimising DNA Schmitt I, Crespo A, Divakar PK et al. 2009 – extraction from herbarium specimens. New primers for promising single copy PLoS ONE 7, e43808. genes in fungal phylogenetics and Talavera G, Castresana J. 2007 – Improvement systematic. Persoonia 23, 35–40. of phylogenies after removing Schoch CL, Seifert KA, Huhndorf A et al. divergent and ambiguously aligned 2012 – Nuclear ribosomal internal blocks from protein sequence transcribed spacer (ITS) region as a alignments. Systematic Biology 56, universal DNA barcode marker for 564–577. Fungi. Proceedings of the National Taylor DL, McCormick MK. 2008 – Internal Academy of Sciences USA 109, 6241– transcribed spacer primers and sequen- 6246. ces for improved characterization of Seifert K, Rossman AY. 2010 – How to basidiomycetous orchid mycorrhizas. describe a new fungal species. IMA New Phytologist 177, 1020–1033. Fungus 1, 109–116. Taylor JW, Jacobson DJ, Kroken S, Kasuga T, Shenoy BD, Jeewon R, Hyde KD. 2007 – Geiser DM, Hibbett DS, Fisher MC. Impact of DNA sequence-data on the 2000 – Phylogenetic species recogni- taxonomy of anamorphic fungi. Fungal tion and species concepts in fungi. Diversity 26, 1–54. Fungal Genetics and Biology 31, 21– Silva DN, Talhinas P, Várzea V, Cai L, Paulo 32. OS, Batista D. 2012 – Application of Tedersoo L, Abarenkov K, Nilsson RH et al. the Apn2/MAT locus to improve the 2011 – Tidying up International systematics of the Colletotrichum Nucleotide Sequence Databases, gloeosporioides complex, an example ecological, geographical, and sequence from coffee (Coffea spp.) hosts. quality annotation of ITS sequences of Mycologia 104, 396–409. mycorrhizal fungi. PLoS ONE 6, Simmons MP, Ochoterena H. 2000 – Gaps as e24940. characters in sequence-based Thon MR, Royse DJ. 1999 – Partial ß-tubulin

31 Current Research in Environmental & Applied Mycology Doi 10.5943/cream/3/1/1 gene sequences for evolutionary studies markers for fungal phylogenetics, two in the Basidiomycotina. Mycologia 91, genes for species level systematics in 468–474. the Sordariomycetes (Ascomycota). Toju H, Tanabe AS, Yamamoto S, Sato H. Molecular Phylogenetics and Evolution 2012 – High-coverage ITS primers for 64, 500–512. the DNA-based identification of Weir BS, Johnston PR, Damm U. 2012 – The ascomycetes and basidiomycetes in Colletotrichum gloeosporioides species environmental samples. PLoS ONE 7, complex. Studies in Mycology 73, 115– e40863. 180. de la Torre-Bárcena JE, Kolokotronis S-O, Lee White TJ, Bruns T, Lee S, Taylor JW. 1990 – EK et al. 2009 – The impact of Amplification and direct sequencing of outgroup choice and missing data on fungal ribosomal RNA genes for major seed plant phylogenetics using phylogenetics. Pp. 315-322 In, PCR genome-wide EST data. PLoS ONE 4, Protocols, A Guide to Methods and e5764. Applications, eds. Innis MA, Gelfand Torrence MS, Thomas GV, Robinson EJ. 1994 DH, Sninsky JJ, and White TJ. – The writing strategies of graduate Academic Press, Inc., New York. research students in the social sciences. Wilson IG. 1997 – Inhibition and facilitation of Higher Education 27, 379–392. nucleic acid amplification. Applied and Udayanga D, Liu X, Crous PW, McKenzie Environmental Microbiology 63, 3741– EHC, Chukeatirote E, Hyde KD. 2012 3751. – A multi-locus phylogenetic Yang ZL. 2011 – Molecular techniques evaluation of Diaporthe (Phomopsis). revolutionize knowledge of basidio- Fungal Diversity 56, 157–171. mycete evolution. Fungal Diversity 50, Varón A, Vinh LS, Wheeler WC. 2009 – POY 47–58. version 4, phylogenetic analysis using Yang Z, Rannala B. 2012 – Molecular dynamic homologies. Cladistics 26, 72– phylogenetics, principles and practice. 85. Nature Reviews Genetics 13, 303–314. Voyron S, Roussel S, Munaut F, Varese GC, Zwickl DJ. 2006 – Genetic algorithm Ginepro M, Declerck S, Filipello approaches for the phylogenetic Marchisio V. 2009 – Vitality and analysis of large biological sequence genetic fidelity of white-rot fungi datasets under the maximum likelihood mycelia following different methods of criterion. Ph.D. Dissertation, The preservation. Mycological Research University of Texas at Austin. 113, 1027–1038. Zwickl DJ, Hillis DM. 2002 – Increased taxon Walker DM, Castlebury LA, Rossman sampling greatly reduces phylogenetic AY,White JF. 2012 – New molecular error. Systematic Biology 51, 588–598.

32