Mississippi State University Scholars Junction

Theses and Dissertations

1-1-2011

A genomic and transcriptomic analysis of wood decay and copper tolerance in the brown rot fungus Fibroporia radiculosa

Juliet D. Tang

Follow this and additional works at: https://scholarsjunction.msstate.edu/td

Recommended Citation Tang, Juliet D., "A genomic and transcriptomic analysis of wood decay and copper tolerance in the brown rot fungus Fibroporia radiculosa" (2011). Theses and Dissertations. 143. https://scholarsjunction.msstate.edu/td/143

This Dissertation - Open Access is brought to you for free and open access by the Theses and Dissertations at Scholars Junction. It has been accepted for inclusion in Theses and Dissertations by an authorized administrator of Scholars Junction. For more information, please contact [email protected]. Template Created By: James Nail 2010

A GENOMIC AND TRANSCRIPTOMIC ANALYSIS OF WOOD DECAY

AND COPPER TOLERANCE IN THE BROWN ROT

FUNGUS FIBROPORIA RADICULOSA

By

Juliet Dao-May Tang

A Dissertation Submitted to the Faculty of Mississippi State University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Forest Resources in the Department of Forest Products

Mississippi State, Mississippi

December 2011

Copyright 2011

By

Juliet Dao-May Tang

A GENOMIC AND TRANSCRIPTOMIC ANALYSIS OF WOOD DECAY

AND COPPER TOLERANCE IN THE BROWN ROT

FUNGUS FIBROPORIA RADICULOSA

By

Juliet Dao-May Tang

Approved:

Susan V. Diehl, Professor of Forest Products (Director of Dissertation)

Shane C. Burgess, Dean of the College of Agriculture and Life Sciences, University of Arizona (Committee Member)

Andy D. Perkins, Assistant Professor of Computer Science and Engineering (Committee Member)

M. Lynn Prewitt, Assistant Research Professor of Forest Products (Committee Member)

Darrel D. Nicholas, Professor of Forest Products (Committee Member)

Daniel G. Peterson, Associate Professor of Plant and Soil Sciences (Committee Member)

Rubin Shmulsky, Professor and Head of Forest Products, Graduate Coordinator

George M. Hopper, Dean of the College of Forest Resources


Name: Juliet Dao-May Tang

Date of Degree: December 9, 2011

Institution: Mississippi State University

Major Field: Forest Resources

Major Professor: Dr. Susan V. Diehl

Title of Study: A GENOMIC AND TRANSCRIPTOMIC ANALYSIS OF WOOD DECAY AND COPPER TOLERANCE IN THE BROWN ROT FUNGUS FIBROPORIA RADICULOSA

Pages in Study: 122

Candidate for Degree of Doctor of Philosophy

Brown rot fungi are notoriously copper-tolerant, which makes them difficult to control with copper-based wood preservatives. Brown rot fungi are also unique because they have evolved a bilateral strategy for decay. Their initial attack involves the production of hydroxyl free radicals to increase wood porosity, followed by an enzymatic onslaught of glycoside hydrolases that free the sugars locked within cellulose and hemicellulose. Our molecular understanding of these biological processes, however, has been hampered by our limited knowledge of the underlying genetic mechanisms.

To address this knowledge gap, high-throughput, short-read sequencing was used to conduct a comprehensive analysis of the genomics and transcriptomics of wood decay and copper tolerance in the brown rot fungus Fibroporia radiculosa. The results were impressively informative. In the genomic study, the sequences of 9262 genes were predicted and gene function was assigned to 5407 of the genes. An examination of target motifs showed that 1213 of the genes encoded products with extracellular functions. By mining these genomic annotations, 187 genes were identified with putative roles in lignocellulose degradation and copper tolerance.

The transcriptomic study quantified gene expression of the fungus growing on wood treated with a copper-based preservative. At day 31, the fungus was adapting to the preservative, and the wood showed no strength loss. At day 154, the preservative effects were gone, and the fungus was actively degrading the wood, which exhibited 52% strength loss. A total of 917 differentially expressed genes were identified, 108 of which appeared to be regulating wood decay and preservative tolerance. Genes that showed increased expression at day 31 were involved in oxalate metabolism, hydroxyl free radical production by the enzyme laccase, energy production, xenobiotic detoxification, copper resistance, stress response, and pectin degradation. Genes that exhibited higher expression at day 154 were involved in wood polysaccharide degradation, hexose transport, oxalate catabolism, catabolism of laccase substrates, proton reduction, re-modeling the glucan sheath, and shoring up the plasma membrane for acid shock. These newly discovered genes represent a significant step towards accelerating a genome-wide understanding of brown rot decay and tolerance to wood preservatives.

Key words: systems biology, wood degradation, copper tolerance

DEDICATION

I dedicate this work to my loving husband, David C. Cross.


ACKNOWLEDGEMENTS

I have learned more than I ever could have hoped for during my graduate studies, and I owe it to my mentor, Susan Diehl. She has given me so many opportunities, the most important of which was to take me on as a graduate student, and then to give me a project that I feel has turned me into a real scientist. Because of her generous support, I have been able to publicize my work internationally and meet the icons of wood protection. Moreover, she is an honest person, who truly cares about the success of her students. I will always admire her for these qualities.

The other person I owe a world of gratitude to is Chuan-Yu Hsu. She is one of my Forever Friends, who also happens to be the best molecular biologist I know. I could always ask her anything, and I know her advice and expertise added enormous value to the quality of my research.

I would also like to thank my co-authors, Shane Burgess, Andy Perkins, Darrel Nicholas, Tad Sonstegard, and Steve Schroeder. Shane funded the genomics study, Andy was my bioinformatics guru, Darrel helped fund the transcriptomics study and taught me how to run decay tests, and Tad and Steve ran my samples through the Genome Analyzer for the DNA sequence analysis. Their contributions to my research are deeply appreciated. Thanks are also due to Cetin Yuceer for letting me use his laboratory when I needed it, to Dan Peterson for telling me that careers are built on genomes, and to Lynn Prewitt, who served on my committee.


TABLE OF CONTENTS

DEDICATION

ACKNOWLEDGEMENTS

TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

CHAPTER

I. INTRODUCTION
    Wood Protection
    Biology of Brown Rot Decay
    Biology of Copper Tolerance
    Knowledge Gaps
    Challenges of the Wood Protection Industry
    Study Objectives
    References Cited

II. SHORT-READ SEQUENCING FOR GENOMIC ANALYSIS OF THE BROWN ROT FUNGUS FIBROPORIA RADICULOSA
    Abstract
    Introduction
    Materials and Methods
        Fungus Isolate
        DNA Library Preparation
        Short-Read Sequencing
        Stringency Filters
        Assembly
        Optimal Assembly Determination
        Gene Prediction
        Functional Annotation
        Computational Analysis
    Results of the Study
        Quality Control of the DNA Library Preparation
        Pipeline
        Short-Read Sequencing and Filtering
        Assembly
        Optimal Assembly Determination
        Gene Prediction
        Annotations
        Wood Decay and Copper Tolerance Genes
    Discussion
    References Cited

III. TRANSCRIPTOMIC ANALYSIS OF THE BROWN ROT FUNGUS FIBROPORIA RADICULOSA ON WOOD TREATED WITH A COPPER-BASED PRESERVATIVE
    Abstract
    Introduction
    Materials and Methods
        Fungus Isolate
        Decay Tests
        RNA and Reverse Transcription (RT) Products
        RNA-Seq Libraries
        Alignment to the Predicted Coding Sequence (CDS)
        Identification of Differentially Expressed Genes
        Mining for Genes Related to Decay of Preservative-Treated Wood
        Computational Analysis
    Results of the Study
        Soil Block Test
        RNA and RT Products
        Bowtie Alignments
        Identification of Differentially Expressed Genes
        Mining for Genes Related to Decay of Preservative-Treated Wood
    Discussion
    References Cited

IV. CONCLUSIONS
    References Cited


LIST OF TABLES

TABLE

2.1 Number of reads and N's in the original (DO) and filtered datasets (DF1, DF2, DF3).

2.2 Metrics from the original (VO) and filtered (VF1, VF2, VF3) Velvet assemblies that had the maximum N50 values. Half the assembly is covered by contigs of size N50 or larger; k, kmer length; max, maximum.

2.3 Gene statistics for the optimal F. radiculosa assembly and the P. chrysosporium assembly.

2.4 TargetP and SignalP analysis of subcellular localization for proteins translated from genes predicted from the F. radiculosa genome.

2.5 Summary of genes and their roles (italics) in lignocellulose degradation and copper tolerance (oxalate metabolism). Localization was predicted by the SignalP tool. GH#, glycoside hydrolase family number; SP, signal peptide; SA, signal anchor; mfs, major facilitator superfamily; gmc, glucose-methanol-choline.

3.1 List of the 108 differentially expressed genes (FDR < 1E-4), their fold changes, and their proposed roles (in italics) during decay of wood treated with a copper-based preservative. Red and blue fold change values indicate genes that were more highly expressed at days 31 and 154, respectively. Target motif abbreviations: P, peroxisomal; S, secretory; M, mitochondrial; SP, signal peptide; SA, signal anchor. Target motifs (S, M, SP, and SA) were assigned by the genomic analysis (Chapter II).


LIST OF FIGURES

FIGURE

2.1 Quality analysis of the genomic DNA as determined by electrophoresis. Undigested DNA (500 ng) and DNA (1 µg) digested with EcoRI or HindIII were loaded in lanes 2, 3, and 4 respectively. The DNA ladder (1 kb Plus, Invitrogen; Carlsbad, CA) was loaded in lane 1.

2.2 Quality analysis of the steps in the preparation of the genomic library as measured by electrophoresis on an Agilent 2100 Bioanalyzer. (A) Nebulization of the genomic DNA produced a broad smear of fragments < 1500 bp. (B) The library (13 nM) consisted of a narrow range of DNA fragments centered at 305 bp. Size markers were at 15 and 1500 bp. Horizontal bars represent the start and end of peak area integration. FU, fluorescence unit.

2.3 Pipeline used for genome assembly and annotation in the absence of any reference sequence to align against.

2.4 Frequency of quality scores (A) and bad scores (score < D) by read position (B) in the original dataset (DO). The quality scores in (A) are arranged in order with the lowest score on the left and the highest score on the right. The last base sequenced was number 76. There was a total of 8.9 Gb in the dataset.

2.5 Effect of varying k or word size on the N50 of the Velvet assemblies from the unfiltered (DO) and filtered (DF1, DF2, and DF3) datasets. N50 is a measure of the size distribution of contigs in the assembly. Half the assembly is contained in contigs of size N50 or larger. k, kmer length.

2.6 Two-way Venn analysis of the assembly VO derived from the original dataset DO (A) and the assembly VF1 derived from the first filtered dataset DF1 (B). The dotted outline contains reads in the dataset that were not used by Velvet in the assembly. The solid outline contains reads in the dataset that were removed by the filter. The filter in (A) removed reads with 38 or more bad scores and in (B) removed reads with 1 or more bad scores. The shaded region contains the filtered used reads, that is, reads used in the assembly and removed by the filter. The Venn diagram is not drawn to scale.

2.7 Quality analysis of the reads in the original dataset that were used by Velvet and removed by the first filter (the shaded region of Figure 2.6.A). The bar charts plot the percent distribution of the quality scores (A), the percent reads with a bad score by read position (B), the frequency of N homopolymers (C), and the frequency of N's per read (D). The first filter removed reads with 38 or more bad scores.

2.8 Quality analysis of the reads in the first filtered dataset that were used by Velvet and removed by the second filter (shaded region of Fig. 2.6.B). The bar charts plot the percent distribution of the quality scores (A), the percent reads with a bad score by read position (B), the frequency of N homopolymer lengths (C), and the frequency of N's per read (D). The second filter removed reads with one or more bad bases.

2.9 Results of each stage of the Blast2GO analysis. The number of genes that were annotated (annot), had no annotations (no annot), no mapping, and no blastp hits. The total number of genes is also graphed.

2.10 Pie chart of the biological process terms found at level two of the directed acyclic graph. The number of genes associated with each term is enclosed in parentheses.

3.1 Percent compression strength loss (+ SE) of MCQ-treated wafers after exposure to F. radiculosa in an accelerated soil block test (A). RNA-Seq libraries were prepared from samples taken at day 31 (B) and day 154 (C). Arrows point to the MCQ-treated wafer, which was laid on the feeder strips in the soil block test.

3.2 Typical electropherograms of RNA isolated from day 31 (A) and day 154 (B) MCQ-treated wafers.

3.3 Multidimensional scaling plot used to identify outlier samples. E1 - E3 and L1 - L4 were the early and late replicates, respectively.

3.4 MA plot of E1 versus L1 before (A) and after normalization (B). E1 and L1 refer to the first replicate of the early and late samples, respectively. The red line was the estimated trimmed mean of M values and was the adjustment applied to account for the compositional bias in the library size. The orange points had a zero or low count number and were artificially represented at the left edge of the graph.

3.5 Plot of log fold change (logFC) versus log concentration (logConc). Genes that showed increased expression at the early (E) and late (L) time points had negative and positive logFC values, respectively. The blue lines marked fold change values of 4. Significant logFC values were colored red (FDR < 1E-4). LogConc values close to 0 were highly abundant across all samples. Three points with one or more zero count values (orange) were artificially represented at the left edge of the graph.

3.6 Differential GO term distribution for the three domains: molecular function (A), biological process (B), and cellular component (C). Each term showed a significant difference in percent genes represented between the highly expressed early and late groups (FDR threshold, 0.05). The percent gene calculation was based on the total number of genes within each group.

3.7 Differential expression of gene models from (A) the TCA cycle and (B) the GLOX cycle that were regulating oxalate production (FDR = 1E-4). Red, more highly expressed early; blue, more highly expressed late. Gene abbreviations: CS, citrate synthase; AD, aconitate hydratase; AP, succinate/fumarate antiporter; ICL, isocitrate lyase; ODH, 2-oxoglutarate dehydrogenase e1 component; GDH, glyoxylate dehydrogenase; ODC, oxalate decarboxylase. Pathway is modified from Munir et al. (2001).


CHAPTER I

INTRODUCTION

Wood Protection

Wood protection is a major research area in forest products. In 2010, the total revenues for the wood preservation industry amounted to $5.3 billion (Reuters 2011). Crossties, poles, and pilings have always been the mainstay of the wood preservation industry, but the demand for homeowner construction materials has grown considerably in the last 50 years, and now represents about 70% of the entire market share of all treated wood products (Preston 2000). Since lumber from this market share is destined for residential sales, the bulk of it is treated with water-borne preservatives. The advantages of these preservatives are that the treated wood is odor-free, easy to paint, and less toxic to handle than wood treated with oil-borne preservatives like creosote and pentachlorophenol.

The primary biocide in water-borne wood preservatives is copper. Copper has high toxicity against many destructive wood decay fungi and relatively low toxicity against mammals. The most effective formulations, however, are those that combine copper with other co-biocides to enhance protection against copper-tolerant brown rot fungi. For example, chromated copper arsenate (CCA) was one of the best water-based wood preservative treatments ever developed. It provided long-term protection of wood (average lifespan 28.7 years) even in ground contact situations where high moisture levels promote decay (Lebow 2010). Concerns over the environmental and health effects of arsenate and chromate, however, led industry to voluntarily withdraw CCA from the residential market beginning in 2002 (EPA 2002). Since then, many new CCA replacements have been sold, most of which combine aqueous or particulate copper with zinc arsenate, quaternary ammonium compounds, tertiary amine compounds, azoles, naphthenate, or boric acid (Lebow 2010).

Biology of Brown Rot Decay

Brown rot fungi are the primary microbes that decay wood in service (Goodell 2003). They are capable of rapidly depolymerizing cellulose during the early stages of decay (Cowling 1961; Kleman-Leyer et al. 1992), which causes considerable strength loss to wood well before weight loss occurs (Curling et al. 2002). Their efficiency stems from their unique ability to exploit a bilateral strategy to decay wood. Their initial attack involves hydroxyl free radicals to increase wood porosity (Koenigs 1974; Flournoy et al. 1991). The actual breakdown of the polysaccharides to their composite sugars, though, appears to rely mostly on enzymatic cleavage by glycoside hydrolases (Valaskova and Baldrian 2006). Once the substrate is fully exhausted, they leave behind a modified lignin residue. This residue has a cuboidal, checked appearance that is brown in color, giving them the name brown rot fungi (Cowling 1961).

To coordinate the decay process, brown rot fungi control the production of a diverse number of supporting molecules. As the fungi colonize the lumen of the wood tracheid cells, they release low molecular weight mediators that diffuse through the primary wood cell wall to the S2 layer of the secondary cell wall, where they initiate decay (Daniel 2003). The S2 layer is the thickest part of the tracheid cell wall and is about 54% cellulose (Rowell 2005). The generally accepted theory is that these low molecular weight mediators interact to form hydroxyl free radicals via Fenton chemistry ($\mathrm{Fe^{2+} + H_2O_2 + H^+ \rightarrow Fe^{3+} + {}^{\bullet}OH + H_2O}$) (Halliwell 1965; Koenigs 1974; Kirk et al. 1991; Arantes et al. 2011). These highly reactive but short-lived free radicals randomly fragment the long cellulose molecules by scission (Kirk et al. 1991). They also cause limited side chain oxidation and demethoxylation of lignin (Arantes et al. 2009a). The net effect of hydroxyl free radical oxidation is an increase in porosity of the lignocellulosic matrix (Flournoy et al. 1991; Irbe et al. 2006), making the cellulose and hemicellulose accessible to the relatively larger enzymes that hydrolytically cleave the fragmented polysaccharides into their composite sugars. Eventually, the hyphae penetrate from the wood cell lumen into the cell walls (Daniel et al. 2007), which probably facilitates the final stages of sugar breakdown and assimilation.

One low molecular weight compound that is believed to mediate Fenton chemistry is oxalic acid. The roles ascribed to oxalic acid are many and complex. They include (1) reducing the pH of the environment (Green et al. 1991; Dutton et al. 1993; Humar et al. 2001; Schilling and Jellison 2006), (2) solubilizing metal (Plassard and Fransson 2009), and (3) controlling the oxalate-metal binding dynamic (Varela and Tien 2003; Arantes et al. 2009b). Results from chemical interaction studies suggest that when oxalate and iron are bound in a 2:1 or 3:1 ratio, the reduction potential of Fe3+ in the complex is too negative, and the Fe3+ cannot be reduced by iron reductants like hydroquinones and phenolate chelators (Varela and Tien 2003; Arantes et al. 2009b; Wei et al. 2010). Below these ratios, the reduction potential of the Fe3+ in the oxalate complex is less negative, and the iron is reducible. The proposed sequence of events (Arantes et al. 2009b) starts with the iron oxalate complex diffusing away from the hyphae along the oxalate and pH concentration gradients; the further along the gradient, the greater the chances that Fe3+ in the complex gets reduced. At some point, binding sites in the wood show greater affinity for the iron, which is now Fe2+, and the Fe2+ moves from the oxalate complex to the wood (Hammel et al. 2002; Arantes et al. 2009b). The attractiveness of this hypothesis is that it is not only consistent with the observed chemical relationships, but it also provides a mechanism that simultaneously protects the hyphae in the tracheid lumen from oxidative damage, while directing hydroxyl free radical attack towards the S2 layer of the secondary wood cell wall.

The iron reducing agents that have thus far been isolated include hydroquinones (Kerem et al. 1999; Jensen et al. 2001; Suzuki et al. 2006), phenolate compounds (Goodell et al. 1997; Arantes et al. 2011), and a 4 kDa peptide, known as the Gt factor (because it was isolated from the brown rot fungus, Gloeophyllum trabeum) (Wang et al. 2006). Most of the research on the interaction of oxalate with the iron reducing agents has focused on the hydroquinones. These investigations have shown that hydroquinones can reduce iron in the absence of oxalate, but they are incapable of reducing iron at the oxalate concentrations typically maintained by brown rot fungi (Wei et al. 2010).

This apparent discrepancy was unresolved until laccase genes were identified in another brown rot fungus, Postia placenta (Martinez et al. 2009). Laccases are monophenol oxidases that were known to catalyze one-electron oxidation of hydroquinones to the semiquinone free radicals. The semiquinones are better reductants because they have more negative reduction potentials than the quinones and have been shown to reduce iron in the iron oxalate complex (Wei et al. 2010). The semiquinone is also capable of reducing O2 to form the perhydroxyl free radical (HOO•) that dismutates to H2O2 (Kerem et al. 1999). Thus, in a series of eight proposed reactions, laccase, which catalyzes the first reaction, can generate a complete Fenton system, capable of generating enough hydroxyl free radical to theoretically account for observed decay levels (Wei et al. 2010). As a monophenol oxidase, laccase could also produce free radicals from phenolates (Gianfreda et al. 1999), but this step may not be necessary since phenolates have been shown to reduce iron directly as long as iron exists in a 1:1 oxalate:iron complex or is bound by wood at pH values between 3.6 and 5.5 (Arantes et al. 2009b).

Biology of Copper Tolerance

Several genera of brown rot fungi are known to be copper-tolerant. They are Fibroporia, Postia, Laetiporus, Meruliporia, Poria, Wolfiporia, Tyromyces, Serpula, and Gloeophyllum (Clausen et al. 2000; Green and Clausen 2003; Hastrup et al. 2005; Humar et al. 2006). They exhibit a range of tolerance with the highest levels observed in F. radiculosa and Postia placenta (Green and Clausen 2003). One isolate of F. radiculosa, TFFH 294, is particularly noteworthy. It was collected in 1971 from field plots located in Mississippi, where it caused premature failure of stakes treated with acid copper chromate (Clausen, personal communication). The acronym, TFFH, in fact stands for "the fungus from hell" and was so named because of its high tolerance to copper-containing preservatives (Clausen, personal communication). It was for this reason that F. radiculosa TFFH 294 was chosen as the model brown rot fungus for this dissertation.

The generally accepted mechanism of copper tolerance is that brown rot fungi secrete high amounts of oxalic acid to solubilize, chelate, and irreversibly precipitate the copper in the form of copper oxalate crystals. These conclusions were based on the visible accumulation of copper oxalate crystals (Clausen et al. 2000) and on an increase in oxalic acid levels (Clausen and Green 2003; Green and Clausen 2003) after exposure to copper-treated wood. Crystal formation has also been observed in response to high concentrations of other metals, such as calcium, zinc, and cobalt (Jarosz-Wilkolazka and Gadd 2003). These results strongly suggest that brown rot fungi actively respond to high metal concentrations by increasing oxalate levels to reduce the bioavailability of the offending metal.

Knowledge Gaps

Most of the past research on brown rot decay has focused on the chemical and structural changes to the wood. Only recently has the fungus moved to the forefront. Early research demonstrated that brown rot fungi use a broad range of enzymes for the final breakdown of cellulose and hemicellulose (Cohen et al. 2005; Valaskova and Baldrian 2006). They exhibit enzyme activity for endoglucanases, endoxylanases, endomannanases, β-glucosidase, β-xylosidase, β-mannosidase, and cellobiohydrolase (Valaskova and Baldrian 2006). Most brown rot fungi, however, do not have cellobiohydrolases, per se, and the production of cellobiose has been primarily attributed to the discovery of a novel processive endoglucanase that releases cellobiose from crystalline cellulose (Cohen et al. 2005).

More recently, a genomic sequencing study of P. placenta provided an estimate of the actual number of genes that could be involved in wood decay. It was quite an eye-opener. For just polysaccharide breakdown, a diverse array of 144 putative glycoside hydrolase genes was identified (Martinez et al. 2009). Experiments designed to identify which of these glycoside hydrolases were being expressed to regulate wood polysaccharide degradation, though, were less successful. When gene expression from microarray experiments was compared for fungus grown in liquid cultures of glucose versus crystalline cellulose (Martinez et al. 2009), relatively few glycoside hydrolase genes were identified because fold changes tended to be small and insignificant. Microarray experiments comparing gene expression on glucose versus ball-milled aspen liquid cultures were somewhat more revealing, but still, only five genes showed substantially higher expression on aspen, which was low compared to the number of glycoside hydrolase proteins actually detected by LC MS/MS from culture filtrates (Vanden Wymelenberg et al. 2010). Therefore, the number of glycoside hydrolase genes that are regulating the wholesale degradation of cellulose and hemicellulose remains unclear.

The genetic mechanisms that brown rot fungi use to control Fenton chemistry are even more enigmatic. This is because the Fenton reactants (iron and H2O2) and their mediators (oxalic acid and the iron reducing agents) are not themselves gene products. Iron is an element that is found in the wood and soil, while H2O2 could be produced chemically during the Fenton reaction or through the action of enzymes. Similarly, oxalic acid, hydroquinones, and phenolates are probably biosynthesized de novo or through enzymatic modification of wood-derived chemicals like the extractives. The point is, in spite of all the past research in wood decay, the genes that control the production of these compounds are still largely unknown.

The recent genomic sequencing analysis of P. placenta has identified many genes that could be involved in producing oxalic acid, laccase, H2O2, hydroquinones, phenolates, or the Gt factor (Martinez et al. 2009). The subsequent gene expression studies, however, failed to provide strong evidence that these genes were, in fact, regulating production of the respective metabolite or, in the case of laccase, enzyme (Martinez et al. 2009; Vanden Wymelenberg et al. 2010; Vanden Wymelenberg et al. 2011). What is the definition of strong evidence? If the hypothesis is that oxalic acid, laccase, and hydroquinones are controlling Fenton chemistry, then one would expect expression of the genes that regulate their production to be up- or down-regulated concurrently. Or, if oxalic acid production is responsible for copper tolerance, then one would expect the genes that regulate oxalic acid biosynthesis to be up-regulated upon exposure to copper-treated wood and down-regulated once the wood was exhibiting signs of active decay.

Challenges of the Wood Protection Industry

A major challenge that the wood protection industry faces is the lack of new biocides in the pipeline to replace copper. However, if researchers understood more about the biological processes that brown rot fungi use to decay wood, then researchers could target key molecules for the development of novel wood preservatives. With the recent technological advances in DNA sequencing, this is theoretically possible. By knowing the function of genes that are differentially expressed, one can link genes to the specific biological process they control. For example, suppose we knew that a particular gene encoded a product that controlled a rate-limiting reaction for survival on copper. Chemical databases could then be searched to determine if an inhibitor existed for that particular gene product. There are three major advantages of using a rational approach such as this: (1) the cost of extensive chemical screening is avoided, (2) databases provide a resource of knowledge that can greatly accelerate the rate of discovery by applying existing information to new applications, and (3) by using current DNA sequencing technology, scientists can address brown rot decay as a systems biology problem. This means we are no longer studying one or several molecules at a time, but can sequence and identify all the genes that make up an organism, or generate profiles of all the genes that are expressed at any one point in time. Thus, there exists the tremendous opportunity for the discovery of novel genes with "never before" described functions and the discovery of metabolic processes that were not previously recognized as being important for the trait being analyzed.


Study Objectives

The objective of this study was to use short-read sequencing technology to identify the underlying genetic mechanisms that brown rot fungi use to decay wood and tolerate copper-based preservatives. Although the feasibility of my approach was still an open question, recent advances in sequencing technology suggested that, in theory, the technique could be applied to fungal-sized genomes. The model organism, F. radiculosa strain TFFH 294, was chosen for its recognized ability to grow on field stakes treated with wood preservatives.

The first aim was to sequence the genome. Because next generation sequencing technologies were rapidly evolving, I had to determine which sequencing options would maximize my success, given budget constraints and what I knew about the structure of other fungal genomes. I opted for Illumina paired-end short-read sequencing of a 300 bp library (76 nt read length). Once these decisions were made, the DNA isolation, library preparation, and sequencing were fairly routine. Analysis of the 118 million reads in the dataset, however, presented some formidable challenges. In addition, there were no "off the shelf" solutions, so I had to piece together an analysis pipeline that included a filtering strategy to remove low quality reads and software tools that could predict gene structure and function. The expected outcome was a comprehensive list of genes with putative functions related to growth, wood decay, and copper tolerance.

The second aim was to identify the genes that were regulating wood decay and tolerance to wood preservatives. My experimental design was a simple time course analysis of decay on wood treated with a preservative called MCQ (micronized copper quaternary compound) in a soil block test. I monitored the decrease in wood strength by compression testing and then used that parameter to judge which two time points should be compared to maximize differences in the gene expression study. RNA was isolated from the same wood wafer that was tested for compression strength loss. Based on what I knew from my previous genomic study of gene structure, I chose a single-end short-read sequencing strategy known as RNA-Seq (76 nt read length) to generate the transcriptomic data. Again, due to the newness and evolving nature of the technology, I was continually evaluating software tools as they were being released, so that I could produce the best analysis possible. The anticipated outcome was a list of differentially expressed genes with significant fold changes between the two time points being compared. Moreover, by mining the functions of these genes, I expected to identify genes regulating wood decay and preservative tolerance, as well as genes controlling related metabolic processes that were not yet discovered.


References Cited

Arantes, V., Milagres, A.M., Filley, T.R., Goodell, B. 2011. Lignocellulosic polysaccharides and lignin degradation by wood decay fungi: the relevance of nonenzymatic Fenton-based reactions. J. Ind. Microbiol. Biotechnol. 38: 541-55.

Arantes, V., Qian, Y., Kelley, S.S., Milagres, A.M., Filley, T.R., Jellison, J., Goodell, B. 2009a. Biomimetic oxidative treatment of spruce wood studied by pyrolysis-molecular beam mass spectrometry coupled with multivariate analysis and 13C-labeled tetramethylammonium hydroxide thermochemolysis: implications for fungal degradation of wood. J. Biol. Inorg. Chem. 14: 1253-63.

Arantes, V., Qian, Y.H., Milagres, A.M.F., Jellison, J., Goodell, B. 2009b. Effect of pH and oxalic acid on the reduction of Fe3+ by a biomimetic chelator and on Fe3+ desorption/adsorption onto wood: implications for brown-rot decay. Int. Biodeterior. Biodegrad. 63: 478-83.

Clausen, C.A., Green, F. 2003. Oxalic acid overproduction by copper-tolerant brown-rot basidiomycetes on southern yellow pine treated with copper-based preservatives. Int. Biodeterior. Biodegrad. 51: 139-44.

Clausen, C.A., Green, F., Woodward, B.M., Evans, J.W., DeGroot, R.C. 2000. Correlation between oxalic acid production and copper tolerance in Wolfiporia cocos. Int. Biodeterior. Biodegrad. 46: 69-76.

Cohen, R., Suzuki, M.R., Hammel, K.E. 2005. Processive endoglucanase active in crystalline cellulose hydrolysis by the brown rot basidiomycete Gloeophyllum trabeum. Appl. Environ. Microbiol. 71: 2412-7.

Cowling, E.B. 1961. Comparative biochemistry of the decay of sweetgum sapwood by white-rot and brown-rot fungi. Madison, WI, Technical Bulletin No. 1258. U.S. Department of Agriculture, Forest Service, Forest Products Laboratory.

Curling, S.F., Clausen, C.A., Winandy, J.E. 2002. Relationships between mechanical properties, weight loss, and chemical composition of wood during incipient brown-rot decay. For. Prod. J. 52: 34-39.

Daniel, G. 2003. Microview of wood under degradation by bacteria and fungi. In: Wood Deterioration and Preservation: Advances in Our Changing World, eds. Goodell, B., Nicholas, D.D., Schultz, T.P. American Chemical Society, Washington, pp. 34-71.

Daniel, G., Volc, J., Filonova, L., Plihal, O., Kubatova, E., Halada, P. 2007. Characteristics of Gloeophyllum trabeum alcohol oxidase, an extracellular source of H2O2 in brown rot decay of wood. Appl. Environ. Microbiol. 73: 6241-43.


Dutton, M.V., Evans, C.S., Atkey, P.T., Wood, D.A. 1993. Oxalate production by basidiomycetes, including the white-rot species Coriolus versicolor and Phanerochaete chrysosporium. Appl. Microbiol. Biotechnol. 39: 5-10.

EPA. 2002. Chromated copper arsenate: evaluating the wood preservative chromated copper arsenate. Retrieved Sep. 19, 2012 from http://www.epa.gov/oppad001/reregistration/cca/cca_evaluating.htm.

Flournoy, D.S., Kent Kirk, T., Highley, T.L. 1991. Wood decay by brown-rot fungi: changes in pore structure and cell wall volume. Holzforschung 45: 383-88.

Gianfreda, L., Xu, F., Bollag, J.M. 1999. Laccases: a useful group of oxidoreductive enzymes. Bioremediation J. 3: 1-25.

Goodell, B. 2003. Brown-rot fungal degradation of wood: our evolving view. In: Wood Deterioration and Preservation: Advances in Our Changing World, eds. Goodell, B., Nicholas, D.D., Schultz, T.P. American Chemical Society, Washington, pp. 97-117.

Goodell, B., Jellison, J., Liu, J., Daniel, G., Paszczynski, A., Fekete, F., Krishnamurthy, S., Jun, L., Xu, G. 1997. Low molecular weight chelators and phenolic compounds isolated from wood decay fungi and their role in the fungal biodegradation of wood. J. Biotechnol. 53: 133-62.

Green, F., Clausen, C.A. 2003. Copper tolerance of brown-rot fungi: time course of oxalic acid production. Int. Biodeterior. Biodegrad. 51: 145-49.

Green, F., Larsen, M.J., Winandy, J., Highley, T.L. 1991. Role of oxalic acid in incipient brown-rot decay. Mater. Organismen 26: 191-213.

Halliwell, G. 1965. Catalytic decomposition of cellulose substrates. Biochem. J. 95: 35-40.

Hammel, K.E., Kapich, A.N., Jensen, K.A., Ryan, Z.C. 2002. Reactive oxygen species as agents of wood decay by fungi. Enzyme Microb. Technol. 30: 445-53.

Hastrup, A.C.S., Green, F., Clausen, C.A., Jensen, B. 2005. Tolerance of Serpula lacrymans to copper-based wood preservatives. Int. Biodeterior. Biodegrad. 56: 173-77.

Humar, M., Bucar, B., Pohleven, F. 2006. Brown-rot decay of copper-impregnated wood. Int. Biodeterior. Biodegrad. 58: 9-14.

Humar, M., Petric, M., Pohleven, F. 2001. Changes of the pH value of impregnated wood during exposure to wood-rotting fungi. Holz Als Roh-Und Werkstoff 59: 288-93.


Irbe, I., Andersons, B., Chirkova, J., Kallavus, U., Andersone, I., Faix, O. 2006. On the changes of pinewood (Pinus sylvestris L.) chemical composition and ultrastructure during the attack by brown-rot fungi Postia placenta and Coniophora puteana. Int. Biodeterior. Biodegrad. 57: 99-106.

Jarosz-Wilkolazka, A., Gadd, G.M. 2003. Oxalate production by wood-rotting fungi growing in toxic metal-amended medium. Chemosphere 52: 541-7.

Jensen, K.A., Jr., Houtman, C.J., Ryan, Z.C., Hammel, K.E. 2001. Pathways for extracellular Fenton chemistry in the brown rot basidiomycete Gloeophyllum trabeum. Appl. Environ. Microbiol. 67: 2705-11.

Kerem, Z., Jensen, K.A., Hammel, K.E. 1999. Biodegradative mechanism of the brown rot basidiomycete Gloeophyllum trabeum: evidence for an extracellular hydroquinone-driven Fenton reaction. FEBS Lett. 446: 49-54.

Kirk, T.K., Ibach, R., Mozuch, M.D., Conner, A.H., Highley, T.L. 1991. Characterization of cotton cellulose depolymerized by a brown-rot fungus, by acid, or by chemical oxidants. Holzforschung 45: 239-44.

Kleman-Leyer, K., Agosin, E., Conner, A.H., Kirk, T.K. 1992. Changes in molecular size distribution of cellulose during attack by white rot and brown rot fungi. Appl. Environ. Microbiol. 58: 1266-70.

Koenigs, J.W. 1974. Hydrogen peroxide and iron: a proposed system for decomposition of wood by brown-rot basidiomycetes. Wood Fiber Sci. 6: 66-80.

Lebow, S.T. 2010. Chapter 15 Wood preservation. In: Wood Handbook - Wood as an Engineering Material, ed. Ross, R.J. U.S. Department of Agriculture, Forest Service, Forest Products Laboratory, Madison, WI, pp. 1-28.

Martinez, D., Challacombe, J., Morgenstern, I., Hibbett, D., Schmoll, M., Kubicek, C.P., Ferreira, P., Ruiz-Duenas, F.J., Martinez, A.T., Kersten, P., Hammel, K.E., Vanden Wymelenberg, A., Gaskell, J., Lindquist, E., Sabat, G., Bondurant, S.S., Larrondo, L.F., Canessa, P., Vicuna, R., Yadav, J., Doddapaneni, H., Subramanian, V., Pisabarro, A.G., Lavin, J.L., Oguiza, J.A., Master, E., Henrissat, B., Coutinho, P.M., Harris, P., Magnuson, J.K., Baker, S.E., Bruno, K., Kenealy, W., Hoegger, P.J., Kues, U., Ramaiya, P., Lucas, S., Salamov, A., Shapiro, H., Tu, H., Chee, C.L., Misra, M., Xie, G., Teter, S., Yaver, D., James, T., Mokrejs, M., Pospisek, M., Grigoriev, I.V., Brettin, T., Rokhsar, D., Berka, R., Cullen, D. 2009. Genome, transcriptome, and secretome analysis of wood decay fungus Postia placenta supports unique mechanisms of lignocellulose conversion. Proc. Natl. Acad. Sci. USA 106: 1954-59.

Plassard, C., Fransson, P. 2009. Regulation of low-molecular weight organic acid production in fungi. Fungal Biol. Rev. 23: 30-39.


Preston, A.F. 2000. Wood preservation - trends of today that will influence the industry tomorrow. For. Prod. J. 50: 13-19.

Reuters. 2011. Research and markets: the US wood preservation industry revenue for the year 2010. Retrieved Sep. 19, 2011 from http://www.reuters.com/article/2011/04/19/idUS223766+19-Apr-2011+BW20110419.

Rowell, R.M. 2005. Handbook of Wood Chemistry and Wood Composites. Boca Raton, FL, CRC Press.

Schilling, J.S., Jellison, J. 2006. Metal accumulation without enhanced oxalate secretion in wood degraded by brown rot fungi. Appl. Environ. Microbiol. 72: 5662-65.

Suzuki, M.R., Hunt, C.G., Houtman, C.J., Dalebroux, Z.D., Hammel, K.E. 2006. Fungal hydroquinones contribute to brown rot of wood. Environ. Microbiol. 8: 2214-23.

Valaskova, V., Baldrian, P. 2006. Degradation of cellulose and hemicelluloses by the brown rot fungus Piptoporus betulinus - production of extracellular enzymes and characterization of the major cellulases. Microbiology 152: 3613-22.

Vanden Wymelenberg, A., Gaskell, J., Mozuch, M., Sabat, G., Ralph, J., Skyba, O., Mansfield, S.D., Blanchette, R.A., Martinez, D., Grigoriev, I., Kersten, P.J., Cullen, D. 2010. Comparative transcriptome and secretome analysis of wood decay fungi Postia placenta and Phanerochaete chrysosporium. Appl. Environ. Microbiol. 76: 3599-610.

Vanden Wymelenberg, A., Gaskell, J., Mozuch, M., Splinter BonDurant, S., Sabat, G., Ralph, J., Skyba, O., Mansfield, S.D., Blanchette, R.A., Grigoriev, I., Kersten, P., Cullen, D. 2011. Significant alteration of gene expression in wood decay fungi Postia placenta and Phanerochaete chrysosporium by plant species. Appl. Environ. Microbiol. 77: 4499-507.

Varela, E., Tien, M. 2003. Effect of pH and oxalate on hydroquinone-derived hydroxyl radical formation during brown rot wood degradation. Appl. Environ. Microbiol. 69: 6025-31.

Wang, W., Huang, F., Mei Lu, X., Ji Gao, P. 2006. Lignin degradation by a novel peptide, Gt factor, from brown rot fungus Gloeophyllum trabeum. Biotechnol. J. 1: 447-53.

Wei, D.S., Houtman, C.J., Kapich, A.N., Hunt, C.G., Cullen, D., Hammel, K.E. 2010. Laccase and its role in production of extracellular reactive oxygen species during wood decay by the brown rot basidiomycete Postia placenta. Appl. Environ. Microbiol. 76: 2091-97.


CHAPTER II

SHORT-READ SEQUENCING FOR GENOMIC ANALYSIS OF THE BROWN ROT

FUNGUS FIBROPORIA RADICULOSA

Abstract

The feasibility of short-read sequencing for genomic analysis was investigated for Fibroporia radiculosa, a copper-tolerant basidiomycete fungus that causes brown rot decay of wood. Illumina GAIIx reads from a single run of a paired-end library (76 nt read length, 300 bp fragment size) were filtered to three levels of stringency. The original and the three filtered datasets were each assembled with a software tool called Velvet. By combining a Venn analysis with histograms of read quality, we were able to determine which of the four datasets produced a "best" assembly. This assembly had a genome size of 33.6 Mb, N50 = 65.8 kb for k = 51, and maximum contig length of 347 kb. Using GeneMark, we predicted 9262 genes. TargetP and SignalP analysis showed that among the 1213 genes with secreted products, 986 had motifs for signal peptides and 227 for signal anchors. Functional annotations were assigned to 5407 genes using Blast2GO. When we mined the annotations, we identified 187 genes with putative roles in lignocellulose degradation and oxalate production for copper tolerance. This work represents a significant step towards accelerating a genome-wide understanding of how brown rot fungi decay wood and reduce the bioavailability of copper.


Introduction

Recent technological achievements in massively parallel sequencing have escalated the rate at which genomes can be sequenced. The most affordable next generation sequencing platform is the Illumina Genome Analyzer, which uses reversible terminator chemistry (Bentley et al. 2008). Although read length is significantly shorter than for the traditional Sanger method of dideoxy terminator sequencing, Illumina sequencing generates millions of reads, thereby providing the extensive coverage needed to produce an assembly. In addition, with each technology upgrade, read length has slowly increased, thereby expanding its capabilities for sequencing the larger, more complex and repetitive genomes of eukaryotes.

Now that the race to sequence genomes has begun, we are gaining a much clearer picture of how Illumina sequencing performs in practice. Three eukaryotic genomes that have been sequenced with either the Illumina technology alone or in combination with other platforms belong to the giant panda bear (Li et al. 2010), the plant pathogenic ascomycete fungus, Grosmannia clavigera (DiGuistini et al. 2009), and another ascomycete, Sordaria macrospora, which is a model organism for fungal morphogenesis (Nowrousian et al. 2010). The assembly for the giant panda bear was a formidable feat that not many researchers can duplicate. It required paired-end reads (52 nt read length) from 37 Illumina libraries (insert sizes ranged from 150 bp to 10 kb) to obtain 56-fold coverage of the genome (Li et al. 2010). The draft assembly encompassed 94% of the 2.4 Gb panda genome, predicted 21,000 genes, and led to the discovery of 2.7 million single nucleotide polymorphisms.
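Fold coverage, as used above, is simply the total number of sequenced bases divided by the genome size. The sketch below is a back-of-the-envelope illustration only. It assumes that the 118 million reads reported for this study are individual 76 nt reads counted before any filtering, and it uses the 33.6 Mb assembly size as the genome size; the function name is hypothetical.

```python
def fold_coverage(n_reads: int, read_length_nt: int, genome_size_bp: float) -> float:
    """Raw fold coverage = total sequenced bases / genome size."""
    return n_reads * read_length_nt / genome_size_bp


# Assumed inputs: 118 million unfiltered reads of 76 nt, 33.6 Mb genome.
raw = fold_coverage(118_000_000, 76, 33.6e6)
print(f"approximately {raw:.0f}-fold raw coverage")
```

Under these assumptions the raw coverage is roughly 267-fold. The usable coverage would be lower, because it depends on how many reads survive filtering and are actually placed in the assembly.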

On a more modest scale, the G. clavigera fungal sequencing project showed that paired-end Illumina reads (42 nt) could be effectively combined with Sanger and Roche platforms to produce a genomic assembly that improved significantly, as measured by N50, over assemblies that lacked the Illumina reads (DiGuistini et al. 2009). The N50 is a measure of an assembly's quality, since by definition, half the assembly is covered by contigs of size N50 or larger. The core assembly was created from paired-end Illumina reads (42 nt read lengths from a 200 bp library). Scaffolds were then built using overlaps with Sanger paired-end reads (average read length of 600 nt from a 40 kb fosmid library) and Roche single-end reads (average read length of 100 and 225 nt). Reads were trimmed to remove low quality sections and the extent of trimming required was evaluated by counting the number of mis-assemblies. This draft of the G. clavigera genome was 32.5 Mb, had an N50 of 32 kb, a scaffold N50 of 558 kb, and a total of 162 gaps. The genome was later manually finished with more sequence data and gap-filling reactions, validated with expressed sequence tag data, and then used to predict genes, obtain functional annotations, and perform transcriptomic analysis (DiGuistini et al. 2011).
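The N50 definition given above can be made concrete with a short calculation. The following is a minimal sketch, not the code used in this study; the function name and the toy contig lengths are hypothetical.

```python
def n50(contig_lengths):
    """Return the N50: the largest length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs supplied")


# Toy assembly: total length 300, so half is 150.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 80 + 70 reaches 150, so the N50 is 70
```

Because the statistic depends only on the length distribution, a high N50 says nothing about whether the contigs are correctly assembled, which is why mis-assembly counts and synteny checks are reported alongside it.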

The S. macrospora assembly was another notable example. It relied solely on the Illumina and Roche next generation sequencing platforms. By varying the number of reads used in the assembly from each platform, the authors showed that the Illumina paired-end reads provided gains in N50 and max contig length, while the Roche reads dramatically decreased the number and length of gaps (Nowrousian et al. 2010). The Illumina reads (36 nt) were primarily paired-end reads from two insert libraries (300 bp and 500 bp) and provided about 85-fold coverage. The Roche single-end reads (367 nt average read length) were sequenced to 10-fold coverage. The draft genome was estimated to be about 40 Mb. The Illumina assembly had an N50 of 51 kb and 17,956 gaps. By adding the Roche reads, the N50 increased to 117 kb, and the number of gaps dropped to 624. By taking advantage of known syntenic regions between S. macrospora and Neurospora crassa, the accuracy of the five largest contigs was verified. In addition, syntenic regions were used to extend the contigs and a scaffold N50 of 498 kb was obtained.

Given these successes, we were highly motivated to sequence the genome of Fibroporia radiculosa (= Antrodia radiculosa). F. radiculosa is a basidiomycete fungus that causes brown rot decay of wood and is known to be copper-tolerant (Green and Clausen 2003). Eumycota, which includes Ascomycota and Basidiomycota, tend to have streamlined genomes that contain <5% repetitive DNA (Wostemeyer and Kreibich 2002). The less repetitive the genome, the greater the chances of obtaining an assembly from Illumina short-read sequencing data. Currently, we know very little about the genetic mechanisms used by brown rot fungi to aggressively degrade the structural polysaccharides of wood or overcome copper-based wood preservatives. Our goal was to predict as many genes as possible from an assembly based on paired-end reads (76 nt) from sequence analysis of one Illumina flow cell run using a single library (300 bp). Furthermore, we developed a rational approach for read filtering without a reference genome for alignment, and used this approach to find an optimal assembly. The contigs from this assembly (> 3 kb) were used to predict genes and estimate genome size. Comparisons to gene sequences in public databases then allowed us to determine Gene Ontology (GO) and function. Since our long-term objective is to understand the genetic mechanisms involved in wood biodegradation and copper tolerance, we mined the functional annotations for genes related to the oxidative and hydrolytic decay of lignocellulose and for genes related to oxalate production, since brown rot fungi are able to remove copper by increasing extracellular oxalate levels (Green and Clausen 2003) and forming copper oxalate crystals (Clausen et al. 2000).


Materials and Methods

Fungus Isolate

F. radiculosa strain TFFH 294 was kindly provided by Carol Clausen, USDA Forest Service, Forest Products Laboratory, Madison WI. The identity of the strain was verified by cloning the ITS region between 18S and 28S rRNA genes after amplification with ITS1 and ITS4 primers (White et al. 1990). The sequenced DNA aligned to two F. radiculosa voucher specimens in the NCBI nucleotide database, confirming its identity. For the DNA isolation, the fungus was grown for 30 days in potato dextrose broth (125 rpm at 25°C). The mycelia were harvested by filtration, rinsed first with 100 mM Tris-HCl and 5 mM EDTA, pH 8.0, then with 70% ethanol. The mycelia were transferred to tubes with 70% ethanol and stored at -80°C.

DNA Library Preparation

Nuclear DNA was extracted using a method originally developed for cotton (Paterson et al. 1993). Briefly, mycelia (1.38 g wet weight) were ground in liquid nitrogen with a mortar and pestle. After adding the extraction buffer (0.35 M glucose, 0.1 M Tris-HCl pH 8.0, 5 mM EDTA pH 8.0, 2% (w/v) polyvinylpyrrolidone 40, 0.1% (w/v) diethyldithiocarbamic acid sodium salt, 0.1% (w/v) ascorbic acid, 0.2% (v/v) 2-mercaptoethanol), the nuclei were pelleted then resuspended in lysis buffer (0.1 M Tris-HCl pH 8.0, 1.4 M NaCl, 20 mM EDTA pH 8.0, 0.2% (w/v) CTAB, 0.1% (w/v) diethyldithiocarbamic acid sodium salt, 0.1% (w/v) ascorbic acid, 0.1% (v/v) 2-mercaptoethanol). Proteins were removed by phenol chloroform extraction, and the DNA was precipitated with ice-cold isopropanol. The DNA was resuspended in water and treated with RNase A (200 ng/µl, 1 hr at room temperature) followed by another phenol chloroform extraction. Polysaccharides were removed by adding 0.3 volumes of cold ethanol, incubating on ice for 10 min, and then centrifuging at 7000 x g for 10 min. The purified DNA from the supernatant was precipitated overnight at -20°C with 1/10 volume 3 M sodium acetate (pH 6.0) and 2 volumes of 95% ethanol. Following centrifugation (10,000 x g for 15 min), the pelleted DNA was washed with 70% ethanol twice to remove salt residues, then resuspended in TE buffer (10 mM Tris-HCl and 1 mM EDTA, pH 8.0). The concentration was measured on a Nanodrop 1000 spectrophotometer. The purity was evaluated by digesting the DNA with EcoRI and HindIII, and electrophoresing the products on a 1% agarose gel in 1x TAE buffer (40 mM Tris-acetate, 1 mM EDTA, pH 7.6).

The library was prepared from 10 µg of DNA (Illumina Genomic DNA Sample Prep Kit; San Diego, CA) according to the protocols provided with the kit. Microfluidic chip electrophoresis on an Agilent 2100 Bioanalyzer (DNA 1000 kit; Santa Clara, CA) was used to determine the size range of the products produced by nebulization (6 min at 34 psi), and the concentration and size range of the library after PCR enrichment. An aliquot of the library was cloned into the pGEM-T Easy vector (Promega; Madison, WI) and five clones were sequenced using fluorescently-labeled dideoxy terminators on a Beckman CEQ 8000 (Beckman Coulter; Brea, CA).

Short-Read Sequencing

Short-read sequencing of our library was performed on a single paired-end flow cell of an Illumina Genome Analyzer IIx (7 lanes of library, 1 lane of φX control). Raw sequence data were processed using Firecrest (image analysis) and Bustard (basecalling) as part of the Illumina GAPipeline v1.4.0. The sequence data obtained were in SCARF format (Solexa Compact ASCII Read Format), which included the information for each read on one line separated by colons (sequencer name, lane number, tile number, x and y cluster coordinates, paired-read designation, base calls, and ASCII quality scores). SCARF files were converted to FASTQ and FASTA format. FASTQ files were used for quality analysis and filtering, and FASTA files were used as input for the genome assembly tool.
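The SCARF-to-FASTQ and SCARF-to-FASTA conversions can be illustrated as follows. This is a minimal sketch and not the Perl script used in the study. It assumes that base calls and quality scores are the last two colon-separated fields of each line and that the quality string itself never contains a colon; the function names and the example read are hypothetical.

```python
def split_scarf(line: str):
    """Split one SCARF line into (read identifier, base calls, quality string).

    Assumption: the last two colon-separated fields are the base calls and
    the quality scores; all preceding fields are kept as the identifier.
    """
    read_id, bases, quals = line.rstrip("\n").rsplit(":", 2)
    if len(bases) != len(quals):
        raise ValueError(f"base/quality length mismatch in read {read_id}")
    return read_id, bases, quals


def scarf_to_fastq(line: str) -> str:
    """Format one SCARF line as a four-line FASTQ record."""
    read_id, bases, quals = split_scarf(line)
    return f"@{read_id}\n{bases}\n+\n{quals}\n"


def scarf_to_fasta(line: str) -> str:
    """Format one SCARF line as a two-line FASTA record."""
    read_id, bases, _ = split_scarf(line)
    return f">{read_id}\n{bases}\n"


# Hypothetical read: machine M1, lane 3, tile 12, x=100, y=200, first of pair.
example = "M1:3:12:100:200#0/1:ACGTN:bbaDB"
print(scarf_to_fastq(example), end="")
```

Splitting from the right keeps the identifier intact regardless of how many colon-separated fields precede the base calls.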

Stringency Filters

The dataset was filtered to three different levels of stringency. The quality scores in the original dataset ranged from the worst score of B to the best score of b. The scores follow the same character order as in the ASCII table, but exclude the character C. B is separated from the other quality scores because it is a read segment score that denotes stretches of the read that are of low quality. We defined a bad score as any score less than D. Our lowest stringency filter (F1) discarded reads with 38 or more bad scores, the moderate stringency filter (F2) discarded reads with one or more bad scores, and the most stringent filter (F3) discarded reads with one or more N base calls. The original dataset was denoted DO and the progressively filtered datasets as DF1, DF2, and DF3.
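The three filters can be expressed as simple predicates on a read's base calls and quality string. The sketch below is illustrative only and assumes the filters were applied progressively (DF2 drawn from DF1, and DF3 from DF2), as the dataset names suggest. All function names are hypothetical, and the handling of read pairs is omitted.

```python
BAD_CUTOFF = "D"  # per the text, a score is "bad" if it sorts below D in ASCII


def count_bad(quals: str) -> int:
    """Number of quality characters that sort below the cutoff."""
    return sum(1 for q in quals if q < BAD_CUTOFF)


def passes_f1(bases: str, quals: str) -> bool:
    return count_bad(quals) < 38      # F1 discards reads with 38 or more bad scores


def passes_f2(bases: str, quals: str) -> bool:
    return count_bad(quals) == 0      # F2 discards reads with any bad score


def passes_f3(bases: str, quals: str) -> bool:
    return "N" not in bases           # F3 discards reads with any N base call


def most_stringent_dataset(bases: str, quals: str) -> str:
    """Return the most stringently filtered dataset that still contains the read."""
    level = "DO"
    for name, keep in (("DF1", passes_f1), ("DF2", passes_f2), ("DF3", passes_f3)):
        if not keep(bases, quals):
            break
        level = name
    return level


print(most_stringent_dataset("A" * 76, "b" * 76))
```

In this toy call the read has no bad scores and no N base calls, so it survives all three filters.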

Assembly

The short-read assembly tool we used was Velvet 0.7.55 (Zerbino and Birney 2008). To meet the Velvet FASTA file input requirements, paired-end reads were merged into one file and single-end reads were input in separate files. The latter were generated during filtering, when only one read of the pair was discarded. For each of the four datasets (DO, DF1, DF2, DF3), the velveth command was run with 8 or 9 different kmer (k) hash values. Velvetg commands were executed with the following options: exp_cov auto, min_contig_lgth 100, and ins_length 300. The assembly with the maximum N50 value for each dataset was designated: VO, VF1, VF2, and VF3. Metrics for each of these assemblies were obtained from the Velvet output. To obtain a FASTA file of the unused reads for VO, VF1, VF2, and VF3, the option, unused_reads yes, was specified in the velvetg command. Unused reads had no kmers in the contigs of the assembly. The stats.txt file generated by Velvet for the VO, VF1, VF2, and VF3 assemblies was used to plot a frequency histogram of contig coverage weighted by contig length. This histogram verified that the average contig coverage and contig coverage cutoff values estimated by the auto option coincided, respectively, with the peak and the start of the peak in the Poisson distribution (data not shown). Velvet assemblies were all performed on a Sun SPARC Enterprise T5120 server equipped with 64 GB RAM. Each assembly took less than 24 hrs, and used about 30 GB RAM.
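The velveth and velvetg invocations described above can be sketched as argument lists, one pair per value of k. This is only an illustration of how the stated options fit together. The file names, output directory names, and the particular k values are hypothetical (the text states only that 8 or 9 values were tried), and the commands are assembled here rather than executed.

```python
def velvet_commands(out_dir, k, paired_fasta, single_fastas=(), unused_reads=False):
    """Build velveth and velvetg argument lists for one value of k."""
    # velveth: hash the merged paired-end file, plus any single-end files.
    velveth = ["velveth", out_dir, str(k), "-fasta", "-shortPaired", paired_fasta]
    for path in single_fastas:
        velveth += ["-short", path]
    # velvetg: options as stated in the text.
    velvetg = ["velvetg", out_dir,
               "-exp_cov", "auto",
               "-min_contig_lgth", "100",
               "-ins_length", "300"]
    if unused_reads:
        velvetg += ["-unused_reads", "yes"]
    return velveth, velvetg


# Hypothetical sweep over nine odd k values for the DF2 dataset.
for k in range(35, 68, 4):
    hash_cmd, graph_cmd = velvet_commands(
        f"DF2_k{k}", k, "DF2_paired.fa", ["DF2_single.fa"])
    print(" ".join(hash_cmd))
    print(" ".join(graph_cmd))
```

Each pair could be passed to subprocess.run on a machine with Velvet installed. Values of k above 31 require Velvet to be compiled with a larger maximum kmer length.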

Optimal Assembly Determination

Of VO, VF1, VF2, and VF3, the optimal assembly was selected based on an analysis of the quality of the reads used in an assembly and removed by the next level of filtering. The reads in this region were defined by a two-way Venn analysis. For example, the original dataset (DO) included reads that were used in the assembly and reads that were filtered out to produce the next dataset, DF1. The Venn diagram identified the reads that were used in the assembly and removed by the next filter. The quality of the reads in this region (filtered used reads) was assessed by plotting percent distribution of the quality scores, percent distribution of reads with bad scores at each read position, frequency of N homopolymers, and frequency of N's per read. The optimal assembly determined from this analysis was then used for gene prediction, subcellular localization, and functional annotation.

The Venn analysis was necessary because Velvet does not output a file with only the used reads. The three files we used were the Velvet sequence FASTA file (Illumina and Velvet read identifiers), which contained the reads of the dataset, the Velvet unused reads FASTA file (Velvet identifier only), and the FASTQ file of the reads that were removed by the next filter (Illumina identifier only). Hash tables were used to match all reads to their Illumina identifier. The next step was to find the filtered used reads. This was accomplished by searching for the Illumina identifiers of the filtered reads in a hash table of the Illumina identifiers of the unused reads. If the identifier was not found, then the read belonged to the group of filtered used reads.
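The hash-table lookup described above amounts to a set difference. The sketch below restates it with Python dictionaries and sets rather than the Perl hashes used in the study. It assumes the identifiers have already been parsed out of the three files, and every name and identifier shown is hypothetical.

```python
def filtered_used_reads(velvet_to_illumina, unused_velvet_ids, filtered_illumina_ids):
    """Return the reads removed by the next filter that Velvet nevertheless used.

    velvet_to_illumina    -- dict built from the Velvet sequence file
    unused_velvet_ids     -- Velvet identifiers from the unused reads file
    filtered_illumina_ids -- Illumina identifiers of reads removed by the next filter
    """
    # Translate the unused reads to Illumina identifiers (the hash table).
    unused_illumina = {velvet_to_illumina[v] for v in unused_velvet_ids}
    # A filtered read that is absent from the unused table was used in the assembly.
    return {rid for rid in filtered_illumina_ids if rid not in unused_illumina}


# Toy example with four reads.
id_map = {"1": "readA", "2": "readB", "3": "readC", "4": "readD"}
print(sorted(filtered_used_reads(id_map, ["2", "4"], ["readA", "readB"])))
```

In the toy example, readA and readB were both removed by the next filter, but only readB was left unused, so readA is the single filtered used read.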

In order to get an estimate of the number of high quality reads (no B or D scores and no N's) that were not used by Velvet in any of the assemblies (VO, VF1, VF2, and VF3), we determined the number of reads that were in the 4-way intersection of the unused reads from all four assemblies. In this case, once the Illumina identifiers were linked with each unused readset, each readset was hashed. The reads that existed in all four hashes defined the 4-way intersection.
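The 4-way intersection is a direct set operation once each unused readset is keyed by Illumina identifier. A minimal sketch with hypothetical identifiers:

```python
def reads_unused_everywhere(*unused_readsets):
    """Return the identifiers present in every one of the supplied readsets."""
    if not unused_readsets:
        raise ValueError("at least one readset is required")
    return set.intersection(*(set(readset) for readset in unused_readsets))


# Toy unused readsets for the VO, VF1, VF2, and VF3 assemblies.
unused_vo = ["r1", "r2", "r3", "r7"]
unused_vf1 = ["r2", "r3", "r7"]
unused_vf2 = ["r3", "r5", "r7"]
unused_vf3 = ["r3", "r7", "r9"]
print(sorted(reads_unused_everywhere(unused_vo, unused_vf1, unused_vf2, unused_vf3)))
```

Only r3 and r7 appear in all four toy readsets, so they form the intersection.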

Gene Prediction

GeneMark-ES v2 was used for gene prediction (Ter-Hovhannisyan et al. 2008). We ran the tool first with the long contigs that met the GeneMark requirements (contigs > 20 kb and a minimum contig sum > 10 Mb) to get the long contig gene predictions, then with the long and short contigs (> 3 kb) to get the predictions of just the short contigs. Genes were not predicted from contigs < 3 kb. Partial genes, i.e., those that lacked a start or stop codon, were removed from the GeneMark GTF output file before converting it to the GFF file format. The gtf2gff3 script for the conversion was downloaded from the Sequence Ontology website (Sequence Ontology Project 2008). Gene and coding sequences (CDS) were then extracted from the GFF file, and the corresponding protein sequences were translated. Descriptive statistics for gene and CDS sequences were analyzed and compared with the genome of the white rot fungus, Phanerochaete chrysosporium (Vanden Wymelenberg et al. 2006). Currently, there are no published gene statistics for brown rot fungi.
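Removal of partial genes from a GTF file can be sketched as below. This is an illustration rather than the script used in the study. It assumes that complete genes carry both a start_codon and a stop_codon feature line and that every line carries a gene_id attribute in the ninth column; the function name and the toy records are hypothetical.

```python
import re
from collections import defaultdict

GENE_ID = re.compile(r'gene_id "([^"]+)"')


def complete_gene_lines(gtf_lines):
    """Keep only the GTF lines of genes that have both a start and a stop codon."""
    features_by_gene = defaultdict(set)
    records = []
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) < 9:
            continue  # not a well-formed GTF feature line
        match = GENE_ID.search(columns[8])
        if match is None:
            continue
        gene = match.group(1)
        features_by_gene[gene].add(columns[2])  # column 3 holds the feature type
        records.append((gene, line))
    complete = {gene for gene, features in features_by_gene.items()
                if {"start_codon", "stop_codon"} <= features}
    return [line for gene, line in records if gene in complete]


# Toy input: gene g1 is complete, gene g2 lacks a stop codon.
toy = [
    'c1\tGeneMark\tstart_codon\t10\t12\t.\t+\t0\tgene_id "g1"; transcript_id "g1.t";',
    'c1\tGeneMark\tCDS\t10\t90\t.\t+\t0\tgene_id "g1"; transcript_id "g1.t";',
    'c1\tGeneMark\tstop_codon\t88\t90\t.\t+\t0\tgene_id "g1"; transcript_id "g1.t";',
    'c1\tGeneMark\tstart_codon\t200\t202\t.\t+\t0\tgene_id "g2"; transcript_id "g2.t";',
    'c1\tGeneMark\tCDS\t200\t260\t.\t+\t0\tgene_id "g2"; transcript_id "g2.t";',
]
print(len(complete_gene_lines(toy)))
```

The filter makes two passes in effect: it first records which feature types each gene has, then emits lines only for genes with both codon features, preserving the original line order.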

TargetP 1.1 and SignalP 3.0 for eukaryotes were used to determine the subcellular localization of the gene products (Emanuelsson et al. 2007). Both required only the first 100 amino acids of the open reading frame as input. TargetP determined if the protein was secretory or localized to the mitochondria and SignalP predicted the presence of a signal peptide or membrane anchor. The SignalP hidden Markov model tool was used because we wanted the highest sensitivity to detect signal peptides (98.4 vs 85.0%, respectively) and accepted the higher false positive rate (3.4 vs 0.4%, respectively) (Emanuelsson et al. 2007).
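Preparing the input for both tools reduces to truncating each predicted protein to its N-terminal 100 residues. A minimal sketch, assuming the proteins are held in a dictionary keyed by gene identifier; the function name and the sequences are hypothetical.

```python
def n_terminal_fragments(proteins, n_residues=100):
    """Return the first n_residues amino acids of each protein.

    A trailing '*' (stop symbol from translation) is dropped first, so it is
    never passed on for proteins shorter than n_residues.
    """
    return {gene: seq.rstrip("*")[:n_residues] for gene, seq in proteins.items()}


toy_proteins = {"g1": "M" + "A" * 150 + "*", "g2": "MKL*"}
fragments = n_terminal_fragments(toy_proteins)
print({gene: len(seq) for gene, seq in fragments.items()})
```

Proteins shorter than the cutoff are passed through whole, which is why g2 in the toy example keeps its three residues.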

Functional Annotation

The Blast2GO suite (v 2.4.9) was used to automate the process of functional annotation (Gotz et al. 2008). The analysis was performed in July of 2011. The order of analysis was: blastp, map, annotate, InterPro search, merge GO, and GOSlim. Blastp retrieved the top 20 hits in the NCBI nr database. Hits were mapped by four methods to obtain their associated GO terms. The annotate rule found the most specific GO terms and their reliability, then additional terms were retrieved based on conserved domain searches from the InterPro database. All GO terms were merged and condensed to their broad functional categories using the GO-Slim generic database. Query and graphing tools within the Blast2GO suite were used to summarize the results of the annotations and to produce the directed acyclic graph for biological process.

To assess the value of this genomic resource that we created, we identified genes that could be related to wood decay. We mined the data for annotations related to the hydrolytic breakdown of pectin, hemicellulose, and cellulose (glycoside hydrolases), sugar transport, copper tolerance (genes involved in oxalate metabolism), lignin modification (laccases and multicopper oxidases), and oxidative breakdown of wood by the Fenton reaction (genes involved in H2O2 metabolism, iron reduction, and quinone redox cycling). Experimental evidence for the catalytic activities of the glycoside hydrolase families was obtained from the carbohydrate-active enzymes database (Cantarel et al. 2009). We also searched for genes that were not found in our annotations, but had wood decay functions in other species. In P. chrysosporium, these were genes for the low molecular weight glycoproteins (encoded by glp1 and glp2) (Tanaka et al. 2007), and for cellobiose dehydrogenase, both of which are involved in iron reduction (Li et al. 1997). In brown rot fungi, the genes were for a low molecular weight peptide isolated from Gloeophyllum trabeum called the Gt factor, which has iron reducing capabilities (Wang et al. 2006), and a gene for an oxalate efflux transporter identified from Fomitopsis palustris (Watanabe et al. 2010). For each gene, we performed a blastp search (Altschul et al. 1997) of the protein product against our F. radiculosa database, which was derived from contigs > 3 kb.
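One simple way to mine annotations of this kind is a case-insensitive keyword search over the description assigned to each gene. The sketch below is a hypothetical illustration and not the query procedure used within Blast2GO. The category names and keywords are assumptions drawn from the functional groups listed above, and the gene identifiers and descriptions are invented for the example.

```python
# Hypothetical keyword lists; a real search would need curated terms.
CATEGORY_KEYWORDS = {
    "glycoside hydrolases": ("glycoside hydrolase",),
    "sugar transport": ("sugar transporter",),
    "oxalate metabolism": ("oxalate",),
    "lignin modification": ("laccase", "multicopper oxidase"),
    "quinone redox cycling": ("quinone",),
}


def mine_annotations(annotations, category_keywords=CATEGORY_KEYWORDS):
    """Group gene identifiers by the categories whose keywords they mention.

    annotations -- dict mapping gene identifier to annotation description
    """
    hits = {category: [] for category in category_keywords}
    for gene_id, description in annotations.items():
        text = description.lower()
        for category, keywords in category_keywords.items():
            if any(keyword in text for keyword in keywords):
                hits[category].append(gene_id)
    return hits


toy_annotations = {
    "g1": "Glycoside hydrolase family 5 protein",
    "g2": "oxalate decarboxylase",
    "g3": "Laccase",
    "g4": "hypothetical protein",
}
for category, genes in mine_annotations(toy_annotations).items():
    print(category, genes)
```

Substring matching of this kind is deliberately permissive, so candidate lists produced this way would still need manual review before being treated as putative wood decay genes.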


Computational Analysis

All software tools mentioned were freely downloaded from the Internet. Custom Perl (Schwartz and Phoenix 2001) and BioPerl (Stajich et al. 2002) scripts were written by J.D. Tang, unless otherwise specified. Descriptive gene statistics were computed in R (R Core Development Team 2011).

Results of the Study

Quality Control of the DNA Library Preparation

The yield of our genomic DNA isolation was 69.8 µg/g wet mycelia. Agarose gel electrophoresis (Figure 2.1) of the undigested DNA (500 ng) showed that the DNA was intact and free of RNA (lane 2). Following digestion of 1 µg DNA with EcoRI (lane 3) or HindIII (lane 4), a smear of fragments smaller than the undigested DNA was obtained. These results indicated that the DNA was suitable for the library preparation.

Figure 2.1 Quality analysis of the genomic DNA as determined by electrophoresis. Undigested DNA (500 ng) and DNA (1 µg) digested with EcoRI or HindIII were loaded in lanes 2, 3, and 4, respectively. The DNA ladder (1 kb Plus, Invitrogen; Carlsbad, CA) was loaded in lane 1; the labeled ladder bands are 12.0, 4.0, 3.0, 2.0, 1.6, 1.0, 0.5, and 0.1 kb.

After nebulization (6 min at 34 psi) and silica spin-column cleanup, the sheared genomic DNA (10 µg) produced a broad smear of fragments less than 1500 bp (Figure 2.2.A). Samples that were not properly fragmented produced a noticeable peak between 700 and 1500 bp after chip electrophoresis (data not shown). The library after PCR enrichment exhibited a concentration of 13 nM and comprised a narrow range of DNA fragments centered about 305 bp (Figure 2.2.B). The small peak at 69 bp was probably residual primer from the enrichment step. These primers do not affect the subsequent cluster formation on the flow cell (Illumina 2008). When five of the cloned library inserts were sequenced, three gave alignments to another brown rot fungus, Postia placenta, using tblastx (Altschul et al. 1997) against the NCBI nucleotide database, and the other two failed to align (data not shown). These results provided preliminary evidence that our library was pure enough for sequencing, contained the desired fragment size (300 bp), and was of fungal origin.

Pipeline

An overview of the steps we developed for genome assembly and annotation in the absence of expressed sequence tag or reference sequence is shown in Figure 2.3. The first goal was to identify an optimal assembly from the progressively filtered readsets (Figure 2.3.A). The second goal was to predict genes from the contigs in the optimal assembly, then use the in silico protein translations to determine subcellular localization and function (Figure 2.3.B). Functional annotations were by inference, that is, similarity to sequences in public databases implied similar function. The potential value of the genomic resource was then appraised by mining for functions related to the oxidative and hydrolytic decay of lignocellulose.

Figure 2.2 Quality analysis of the steps in the preparation of the genomic library as measured by electrophoresis on an Agilent 2100 Bioanalyzer. (A) Nebulization of the genomic DNA produced a broad smear of fragments < 1500 bp. (B) The library (13 nM) consisted of a narrow range of DNA fragments centered at 305 bp. Size markers were at 15 and 1500 bp. Horizontal bars represent the start and end of peak area integration. FU, fluorescence unit.

Short-Read Sequencing and Filtering

Error rates of the PhiX control for each paired-end read were 1.20 and 1.09%. The frequency of the base quality scores for the original dataset is shown in Figure 2.4.A. There was a total of 8.9 Gb in the dataset with the majority (>2.5 Gb) exhibiting high quality scores. About 0.5 Gb, however, had the lowest quality score of B. Having defined a bad score as < D, a histogram of the frequency of bad scores by read position showed an exponential increase, indicating that the relationship was cumulative, i.e. once a read went bad, then the rest of the read was also likely to be bad (Figure 2.4.B).
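The bad-score tally and the threshold filters described in this chapter can be illustrated with a short Python sketch. It is not the Perl code used in this study. It assumes that quality scores are stored as single ASCII characters whose character order matches score order, so that any character below D counts as bad. The function names, the max_bad parameter, and the toy quality strings are hypothetical.

```python
from collections import Counter

BAD_THRESHOLD = "D"  # assumption: a quality character below "D" is a bad score


def count_bad(quality):
    """Count the positions in one quality string that carry a bad score."""
    return sum(1 for q in quality if q < BAD_THRESHOLD)


def bad_by_position(qualities):
    """Tally, per 1-based read position, how many reads have a bad score there."""
    tally = Counter()
    for quality in qualities:
        for pos, q in enumerate(quality, start=1):
            if q < BAD_THRESHOLD:
                tally[pos] += 1
    return tally


def keep_read(quality, max_bad):
    """Keep a read only if it has fewer than max_bad bad scores."""
    return count_bad(quality) < max_bad


# Toy quality strings (hypothetical; 8 positions instead of 76).
toy = ["hhhhhhBB", "hhhhBBBB", "hhhhhhhh"]
position_tally = bad_by_position(toy)
kept = [keep_read(q, max_bad=3) for q in toy]
print(dict(position_tally))
print(kept)
```

With max_bad set to 38 or to 1, the same predicate mirrors the two filter stringencies reported for Figure 2.6.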

Figure 2.3 Pipeline used for genome assembly and annotation in the absence of any reference sequence to align against. (A) Short read sequencing data; filter to different stringencies; assemble and find max N50; Venn analysis to find reads in the used and filtered set; quality analysis of used and filtered readset; identify an optimal assembly. (B) Optimal assembly; predict genes; predict proteins; localize gene products; identify blastp hits; get gene ontology annotations; mine for functions related to wood decay.

Figure 2.4 Frequency of quality scores (A) and bad scores (score < D) by read position (B) in the original dataset (DO). The quality scores in (A) are arranged in order with the lowest score on the left and the highest score on the right. The last base sequenced was number 76. There was a total of 8.9 Gb in the dataset.

The number of reads and number of N's in each dataset are listed in Table 2.1. DO had 117.7 M reads. F1 removed 4.0 M reads and 5.3 M N's. F2 removed 50.9 M reads and 7.0 M N's, and F3 removed about 49,000 reads and 49,700 N's. Using a genome size of 33 Mb, the potential nucleotide coverages for DO, DF1, DF2, and DF3 were approximately 271x, 262x, 145x, and 145x, respectively.
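The potential coverage values can be reproduced from the read counts in Table 2.1. The sketch below is illustrative only. It assumes that every read contributes the full 76 nt read length and that coverage is computed as total sequenced bases divided by the 33 Mb genome size; the function and variable names are hypothetical.

```python
READ_LENGTH_NT = 76           # read length reported for this sequencing run
GENOME_SIZE_BP = 33_000_000   # genome size used in the text for this estimate

# Read counts taken from Table 2.1.
reads = {
    "DO": 117_745_354,
    "DF1": 113_780_324,
    "DF2": 62_927_297,
    "DF3": 62_878_344,
}


def potential_coverage(n_reads, read_length=READ_LENGTH_NT, genome_size=GENOME_SIZE_BP):
    """Potential nucleotide coverage = total sequenced bases / genome size."""
    return n_reads * read_length / genome_size


coverage = {name: round(potential_coverage(n)) for name, n in reads.items()}
print(coverage)
```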

Table 2.1 Number of reads and N's in the original (DO) and filtered datasets (DF1, DF2, DF3).

Dataset    Number of Reads    Number of N's
DO         117,745,354        12,363,215
DF1        113,780,324         7,039,088
DF2         62,927,297            49,736
DF3         62,878,344                 0

Assembly

The effects of varying k (word size) on the N50 of the Velvet assemblies from the original dataset, DO, and the filtered datasets, DF1, DF2, and DF3, are shown in Figure 2.5. In general, the lowest N50 values were at the extremes of the k values tested, with one maximum N50 at some intermediate value of k. An unexplained drop in N50, however, was observed in the DF2 assembly at k = 35. Maximum N50 values for DO, DF1, DF2, and DF3 were at k equal to 45, 51, 37, and 37, respectively.

Figure 2.5 Effect of varying k or word size on the N50 of the Velvet assemblies from the unfiltered (DO) and filtered (DF1, DF2, and DF3) datasets. N50 is a measure of the size distribution of contigs in the assembly. Half the assembly is contained in contigs of size N50 or larger. k, kmer length.
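For readers unfamiliar with the N50 statistic, the following Python sketch shows one common way to compute it from a list of contig lengths. It is an illustration of the definition given above, not the routine used by Velvet, and the toy contig lengths are hypothetical.

```python
def n50(contig_lengths):
    """Return the contig length L such that contigs of length >= L
    together contain at least half of the total assembly length."""
    ordered = sorted(contig_lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("contig_lengths must not be empty")


toy_contigs = [80, 70, 50, 40, 30, 20, 10]  # hypothetical lengths, total 300
toy_n50 = n50(toy_contigs)
print(toy_n50)
```

In a k sweep such as the one in Figure 2.5, this value would be computed once per assembly and the k giving the largest N50 retained for each dataset.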

When we compared the average contig coverage and coverage cutoff values selected by Velvet (using the -exp_cov auto option of the velvetg command) with the Poisson distribution obtained by plotting frequency of the coverage weighted by contig length, we found that Velvet chose the correct values as long as k corresponded to the max N50 assembly of a dataset. If k drifted to the more extreme values, the auto option was prone to error (data not shown).

Optimal Assembly Determination

Based on the assembly metrics alone, it was difficult to assess which of the four assemblies with max N50 was optimal (Table 2.2). The VO assembly had the greatest maximum N50 (66.2 kb) and the highest contig coverage (61.6x). The VF1 assembly, on the other hand, had the largest k (51) and longest maximum contig length (347 kb). Maximum N50 values of VF2 and VF3 were both about 1/3 those of VO and VF1, and maximum contig lengths were both between 1/3 and 1/2 those of VO and VF1. The estimated genome size ranged from 30.9 to 33.6 Mb, with the largest from VF1. The number of reads used in the VO and VF1 assemblies was not quite twice the number of used reads from VF2 and VF3, and the number of unused reads was about 1/3 the number of used reads, regardless of assembly. Based on the number of used reads and estimated genome size, the actual nucleotide coverages were 199x for VO, 192x for VF1, and 113x for both VF2 and VF3.

Table 2.2 Metrics from the original (VO) and filtered (VF1, VF2, VF3) Velvet assemblies that had the maximum N50 values. Half the assembly is covered by contigs of size N50 or larger; k, kmer length; max, maximum.

Assembly   k    Max N50   Contig Coverage   Max Contig    Estimated Genome   Used Reads   Unused Reads
                (kb)      (mean)            Length (kb)   Size (Mb)          (M)          (M)
VO         45   66.2      61.6              341.2         33.1               86.6         31.1
VF1        51   65.8      57.0              347.0         33.6               85.1         28.7
VF2        37   23.7      50.1              148.1         31.0               46.1         16.8
VF3        37   24.0      52.5              148.1         30.9               46.1         16.7

The two-way Venn analysis, however, provided a more concrete guide for identifying an optimal assembly (Figure 2.6). This was because we were able to use the distribution of quality scores and N's of the filtered used reads to directly assess how the filters were affecting the assemblies. For the VO assembly, there were 28.2 M unused reads that were not filtered, 2.9 M unused reads that were filtered (moot reads), and 1 M filtered used reads (Figure 2.6.A). The reads in the intersection of the two circles are called moot reads because they were already removed by Velvet and so removal by the filter was redundant. For the VF1 assembly, there were 14.0 M unused reads that were not filtered, 14.7 M moot reads, and 36.2 M filtered used reads (Figure 2.6.B).
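The three regions of the two-way Venn analysis correspond to simple set operations on read identifiers. The sketch below is a minimal Python illustration under the assumption that each read has a unique identifier. It is not the script used in this study, and the function name, dictionary keys, and toy identifiers are hypothetical.

```python
def two_way_venn(unused_ids, filtered_ids):
    """Partition reads into the three regions of the two-way Venn diagram.

    unused_ids:   reads that the assembler did not place in any contig
    filtered_ids: reads that the quality filter would remove
    """
    unused, filtered = set(unused_ids), set(filtered_ids)
    return {
        "unused_not_filtered": unused - filtered,  # discarded by the assembler only
        "moot": unused & filtered,                 # discarded by both, filter redundant
        "filtered_used": filtered - unused,        # used in the assembly, removed by filter
    }


# Toy identifiers (hypothetical).
toy_unused = {"r1", "r2", "r3", "r4"}
toy_filtered = {"r3", "r4", "r5"}
regions = two_way_venn(toy_unused, toy_filtered)
print({name: sorted(ids) for name, ids in regions.items()})
```

The filtered used region is the one whose quality is then examined, because it contains the reads that influenced the assembly but would be lost at the next filter stringency.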

Figure 2.6 Two-way Venn analysis of the assembly VO derived from the original dataset DO (A) and the assembly VF1 derived from the first filtered dataset DF1 (B). The dotted outline contains reads in the dataset that were not used by Velvet in the assembly. The solid outline contains reads in the dataset that were removed by the filter. The filter in (A) removed reads with 38 or more bad scores and in (B) removed reads with 1 or more bad scores. The shaded region contains the filtered used reads, that is, reads used in the assembly and removed by the filter. Read counts (x 10^6) for unused only, moot, and filtered used reads: (A) 28.2, 2.9, 1.0; (B) 14.0, 14.7, 36.2. The Venn diagram is not drawn to scale.

Closer examination of the 1 M filtered used reads of the VO assembly showed that they were of very low quality (Figure 2.7). The majority (58%) of the bases had bad scores (B or D) (Figure 2.7.A). In addition, more than 50% of the reads had bad scores starting at read position 35, and 100% of the reads had bad scores starting from read position 44 (Figure 2.7.B). There was an abundance of long N homopolymers > 5 (Figure 2.7.C) and N's per read > 5 (Figure 2.7.D). There were about 6000 N homopolymers greater than a 35-mer, and there were 6400 reads that had more than 40 N's per read. The frequencies of Nmers of lengths 1 to 4 ranged from 71,564 for Nmer = 1 to 1147 for Nmer = 4. The frequencies of 1 to 4 N's per read ranged from 52,268 to 307, respectively. Since k equals 45 for the VO assembly, the kmers from these used reads would certainly contain long stretches of poor quality sequence. Therefore, we reasoned that the accuracy of the VO assembly could be improved by removing these 1 M reads with F1. This was corroborated by the assembly data, where we saw an increase in specificity or k of the VF1 assembly (k = 45 for VO and k = 51 for VF1) with only a minor reduction in max N50 (66.2 kb for VO and 65.8 kb for VF1).
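The two N statistics used here, the length of each N homopolymer (Nmer) and the number of N's per read, can be tallied as in the following Python sketch. It is an illustration only and assumes reads are plain uppercase strings in which an ambiguous base call is written as N. The function names and toy reads are hypothetical.

```python
import re
from collections import Counter

N_RUN = re.compile(r"N+")  # one or more consecutive ambiguous base calls


def n_run_lengths(read):
    """Lengths of every N homopolymer (Nmer) found in one read."""
    return [len(match.group()) for match in N_RUN.finditer(read)]


def summarize_ns(reads):
    """Return (Nmer length histogram, N's-per-read histogram) for a read set."""
    run_hist = Counter()
    per_read_hist = Counter()
    for read in reads:
        for length in n_run_lengths(read):
            run_hist[length] += 1
        n_count = read.count("N")
        if n_count:  # reads without any N are left out of the histogram
            per_read_hist[n_count] += 1
    return run_hist, per_read_hist


# Toy reads (hypothetical).
toy_reads = ["ACGTNACGT", "ACNNNGTNA", "ACGTACGTA"]
runs, per_read = summarize_ns(toy_reads)
print(dict(runs), dict(per_read))
```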

Figure 2.7 Quality analysis of the reads in the original dataset that were used by Velvet and removed by the first filter (the shaded region of Figure 2.6.A). The bar charts plot the percent distribution of the quality scores (A), the percent reads with a bad score by read position (B), the frequency of N homopolymers (C), and the frequency of N's per read (D). The first filter removed reads with 38 or more bad scores.

Figure 2.8 charts a similar analysis for the 36.2 M filtered used reads of the VF1 assembly. This time, however, the majority of the reads were of acceptable quality. Only 10% of the bases had bad scores (Figure 2.8.A), and the percent reads with a bad score did not exceed 50% until read position 74 (Figure 2.8.B). Even if we consider that 10% of 36.2 M reads equals 3.62 M reads with bad scores, the frequencies of N homopolymers > 5 (Figure 2.8.C) and N's per read > 5 (Figure 2.8.D) were extremely low. The longest N homopolymer was a 22-mer, and there were only 27 N homopolymers greater than a 10-mer (Figure 2.8.C). The max N per read was 26, and there were only 595 reads with more than 10 N's per read (Figure 2.8.D). The frequencies of Nmers < 5 were: 3.4 M for Nmer = 1; 53,083 for Nmer = 2; 757 for Nmer = 3; and 287 for Nmer = 4. The frequencies of N per read < 5 were: 3.3 M for 1 N per read; 116,784 for 2 N's per read; 5,872 for 3 N's per read; and 1,539 for 4 N's per read. Given that the total number of bases in the contigs of the VF1 assembly was 6.5 Gb, it was unlikely that these single N's would affect the assembly since they were well below the 1% single nucleotide polymorphism rate that was shown not to affect accuracy of the Velvet assembly in tests with simulated data (Zerbino and Birney 2008). Therefore, we concluded that VF1 was our optimal assembly. This made it unnecessary to apply F3 or run the associated Venn analysis. The assembly metrics corroborated this conclusion. Comparing VF1 to VF2, we observed a 1.4x loss of specificity (k dropped from 51 to 37, respectively) and a 2.8x decrease in maximum N50 (65.8 and 23.7 kb, respectively).

Figure 2.8 Quality analysis of the reads in the first filtered dataset that were used by Velvet and removed by the second filter (shaded region of Figure 2.6.B). The bar charts plot the percent distribution of the quality scores (A), the percent reads with a bad score by read position (B), the frequency of N homopolymer lengths (C), and the frequency of N's per read (D). The second filter removed reads with one or more bad bases.

The distribution of contigs in this optimal assembly VF1 was: 361 contigs > 20 kb, 500 contigs > 3 kb and < 20 kb, and 16,001 contigs > 100 bp and < 3 kb. The 861 contigs that were > 3 kb covered 28.4 Mb, or 84.5% of the genome, and the estimated genome size was 33.6 Mb. Of these 861 contigs, 392 had no internal gaps while the remaining 469 contigs had 161,121 N's. It is important to distinguish these gap N's from the ambiguous base call N's in the dataset. Velvet converts N's in the datasets to A's, while the N's in the assembly represent gaps between the paired-end reads where there was not enough coverage to determine the sequence. Alignment of the adapter sequences (61 and 57 nt long) to the contigs with blastn (Altschul et al. 1997) showed only one 16 bp match, verifying that there was no adapter contamination in the assembly.

Results of the 4-way Venn comparison showed that 11.6 M reads were in the intersection of the unused reads from the four assemblies VO, VF1, VF2, and VF3. These were reads that had no bad scores and no N's, yet were not included in any of the contigs of the Velvet max N50 assemblies.

Gene Prediction

A total of 9262 genes were predicted from the VF1 assembly. Long contigs (> 20 kb) and short contigs (> 3 kb and < 20 kb) produced 8137 and 1125 gene predictions, respectively. An additional 257 and 109 partial genes were predicted from the long and short contigs, respectively, but were excluded from subsequent analysis. The gene descriptive statistics for our optimal assembly are listed in Table 2.3, alongside values published for P. chrysosporium (Vanden Wymelenberg et al. 2006). The two species exhibited close overall similarity. The F. radiculosa genome was 1.5 Mb smaller, had 786 fewer genes, and had average gene, CDS, and intron lengths that were longer by 150, 66, and 6 bp, respectively. The average exon length was shorter by 13 bp, and the average number of exons per gene was comparable, with 0.6 exons per gene more for F. radiculosa than P. chrysosporium. The percent GC content of the CDS differed by only 0.6%.

Table 2.3 Gene statistics for the optimal F. radiculosa assembly and the P. chrysosporium assembly.

Descriptive Statistic          Fra       Pch*
Genome size (Mb)               33.6      35.1
Number of genes                9,262     10,048
Avg gene length (bp)           1817      1667
Avg CDS length (bp)            1432      1366
Avg intron length (bp)         70        64
Avg exon length (bp)           221       234
Avg number of exons/gene       6.5       5.9
Percent GC of CDS              53.8      53.2

*Gene statistics for the white rot fungus were taken from the version 2.0 assembly of P. chrysosporium (Vanden Wymelenberg et al. 2006). Fra, F. radiculosa; Pch, Phanerochaete chrysosporium.

The results of the TargetP and SignalP analysis of subcellular localization for proteins translated from genes predicted from the optimal assembly are shown in Table 2.4. TargetP recognized 287 gene products as being localized to the mitochondria and 1213 as being secreted. For the gene products localized to the mitochondria, SignalP identified 188 with signal peptides and 99 with membrane signal anchors. For the secreted products, SignalP found 986 with signal peptides and 227 with signal anchors. SignalP found another 250 gene products not localized by TargetP, of which 102 had signal peptides and 148 had signal anchors, for a total of 1750 proteins localized. The difference between a signal peptide and a signal anchor is that the latter does not have a cleavage site and so the protein would remain attached to a membrane. It is important to recognize that not all proteins with signal peptides are secreted and not all secreted proteins carry signal peptides. Regardless, TargetP and SignalP provide the first clues to the subcellular destination of a protein in the cell, which is an integral part of understanding protein function.

Table 2.4 TargetP and SignalP analysis of subcellular localization for proteins translated from genes predicted from the F. radiculosa genome.

TargetP          SignalP Signal    SignalP Signal    Totals
Compartment      Peptide (n)       Anchor (n)
Mitochondria     188               99                287
Secreted         986               227               1213
--               102               148               250
Totals           1276              474               1750

Annotations

Results of the Blast2GO analysis (9262 total predicted genes) showed that 5407 (58%) had GO annotations, 985 (11%) were mapped but had no annotations, 1833 (20%) could not be mapped, and 1037 genes (11%) had no product matches in the NCBI nr database (E-value threshold set to 1E-3) (Figure 2.9). Overall, most of the translated proteins exhibited high similarity with at least one gene product in the nr database (80% or 7459 genes had top blastp hits with E-values equal to or less than 1E-20). Three species exhibited the greatest sequence similarity to F. radiculosa. They were P. placenta, a brown rot fungus, Serpula lacrymans, a dry rot fungus, and Laccaria bicolor, an ectomycorrhizal symbiont. The number of F. radiculosa genes that had top blastp hits to each species was 2973, 2910, and 1000, respectively.

Figure 2.9 Results of each stage of the Blast2GO analysis. The number of genes that were annotated (annot), had no annotations (no annot), no mapping, and no blastp hits. The total number of genes is also graphed. Bar chart; x-axis, Blast2GO analysis category (annot, no annot, no mapping, no blastp hits, total); y-axis, number of genes (0 to 10,000).

The relationships and distribution of biological processes represented in the GO annotations were determined by constructing a directed acyclic graph. There were a total of 3916 genes that had terms in the biological process domain (level one). Terms were represented down to level six. Level three had the most terms, 15 total, and all had node scores >100 except signaling process and macromolecule localization. A pie chart of the level two terms is drawn in Figure 2.10. Most of the sequences had annotations for metabolic process (3072 genes) and cellular process (1702 genes). A number of genes had roles in localization (676 genes), biological regulation (495 genes), cellular component organization (295 genes), response to stimulus (246 genes), and signaling (193 genes). The terms for developmental process and multicellular organismal process were also represented, but at lower levels (128 and 103 genes, respectively).

Figure 2.10 Pie chart of the biological process terms found at level two of the directed acyclic graph. The number of genes associated with each term is enclosed in parentheses: metabolic process (3072), cellular process (1702), localization (676), biological regulation (495), cellular component organization (295), response to stimulus (246), signaling (193), developmental process (128), and multicellular organismal process (103).

Wood Decay and Copper Tolerance Genes

To further judge the value of this genomic resource, we mined the annotated data for functions related to lignocellulose degradation and oxalate production for copper tolerance. We found a total of 187 genes (Table 2.5). The largest group (79 genes) was the glycoside hydrolases (GH). There were 19 different GH families detected and most (60% of the genes) carried signals for secretion. We found 23 genes with annotations for sugar transport, 61% of which carried a motif for a signal peptide or a signal anchor. Six of these transporters were identified as binding to hexose and one was a high affinity glucose transporter. Genes for oxalate metabolism included 16 for oxalate biosynthesis (citrate synthase, aconitase, isocitrate lyase, phosphoenolpyruvate mutase, malate synthase, malate dehydrogenase, and oxaloacetate hydrolase), 10 for oxalate catabolism (oxalate decarboxylase), and 8 that could be involved in oxalate transport (mono- and dicarboxylate transporters). Secretion signals were absent from the oxalate biosynthesis group and present in the oxalate decarboxylase and transport groups. Proteins similar to the F. palustris oxalate efflux transporter were absent. We also detected 29 genes for H2O2 metabolism, although many did not possess secretion signals. Regarding H2O2 production, there were 6 genes for alcohol oxidases (0 signal peptides), 4 aryl-alcohol oxidases (3 signal peptides), 4 copper radical oxidases (4 signal peptides), and 12 glucose-methanol-choline oxidoreductases (1 signal peptide). For H2O2 catabolism, there were 3 catalases (1 signal peptide). Our search for lignin modification genes revealed 5 genes: 2 laccases both with secretion signals, 2 multicopper oxidases (1 signal anchor), and 1 low redox potential peroxidase (1 signal peptide). When we mined the data for genes that could be involved in iron redox, we found 5 iron reductases (1 signal peptide and 2 signal anchors) and 2 iron permeases (1 signal peptide). Two genes (both carrying motifs for signal peptides) showed similarity to the P. chrysosporium glycoproteins 1 and 2 (E-value < 1E-50). Sequences showing similarity to the Gt factor (15 amino acids), however, were not identifiable due to the short length of the query sequence. Quinone reductases have also been implicated in iron redox cycling during brown rot decay (Jensen et al. 2002; Varela and Tien 2003; Suzuki et al. 2006), but neither the quinone reductases (6 genes) nor the quinate permeases (2 genes) we found had signal peptides or signal anchors. Genes similar to cellobiose dehydrogenase, which can catalyze reduction of iron, quinones, O2, and other compounds in white rot fungi, were also absent.

Table 2.5 Summary of genes and their roles (group headings) in lignocellulose degradation and copper tolerance (oxalate metabolism). Localization was predicted by the SignalP tool. GH#, glycoside hydrolase family number; SP, signal peptide; SA, signal anchor; mfs, major facilitator superfamily; gmc, glucose-methanol-choline.

                                              Gene (n)   SP (n)   SA (n)
Glycoside hydrolases
  GH1                                         2          0        0
  GH2                                         4          2        0
  GH3                                         7          3        0
  GH5                                         14         7        0
  GH10                                        3          3        0
  GH12                                        2          2        0
  GH16                                        15         11       0
  GH27                                        2          0        0
  GH28                                        9          7        0
  GH35                                        2          0        1
  GH43                                        3          3        0
  GH51                                        3          0        0
  GH53                                        1          1        0
  GH55                                        3          2        0
  GH61                                        2          2        0
  GH79                                        2          1        0
  GH88                                        2          1        0
  GH105                                       1          1        0
  GH115                                       2          2        0
  Subtotal                                    79         48       1
Sugar transport
  hexose transporter                          6          1        1
  high affinity glucose transporter           1          0        0
  mfs monosaccharide                          16         7        5
  Subtotal                                    23         8        6
Oxalate metabolism
  citrate synthase                            5          0        0
  aconitase                                   2          0        0
  isocitrate lyase                            2          0        0
  phosphoenolpyruvate mutase                  1          0        0
  malate synthase                             1          0        0
  malate dehydrogenase                        4          0        0
  oxaloacetate hydrolase                      1          0        0
  oxalate decarboxylase                       10         4        0
  mono- and di-carboxylate transporters       8          3        0
  Subtotal                                    34         7        0
H2O2 metabolism
  alcohol oxidase                             6          0        0
  aryl-alcohol oxidase                        4          3        0
  copper radical oxidase                      4          4        0
  gmc oxidoreductase                          12         1        0
  catalase                                    3          1        0
  Subtotal                                    29         9        0
Lignin modification
  laccase                                     2          2        0
  multicopper oxidase                         2          0        1
  peroxidase                                  1          1        0
  Subtotal                                    5          3        1
Iron redox
  iron reductase                              5          1        2
  iron permease                               2          1        0
  quinone reductase                           6          0        0
  quinate permease                            2          0        0
  F. radiculosa sequences similar to
    P. chrysosporium glycoproteins 1 and 2    2          2        0
  Subtotal                                    17         4        2

Grand total                                   187        79       10

A closer inspection of the GH genes (Table 2.5) showed that the five families with the most genes and the most secretion signals were GH16 (15 genes, 11 signal peptides), GH5 (14 genes, 7 signal peptides), GH28 (9 genes, 7 signal peptides), GH3 (7 genes, 3 signal peptides), and GH10 (3 genes, 3 signal peptides). Although most of the GH16 family has functions related to the remodeling of the β-1,3-glucans in the fungal cell wall, it can also hydrolyze plant β-glucans (endo-1,3(4)-β-glucanase). GH5 is a broad family that includes endo- and terminal hydrolysis of sugars like those found in hemicellulose and cellulose, e.g. β-mannosidase, cellulase, mannan endo-β-1,4-mannosidase, endo-β-1,4-xylanase, cellulose β-1,4-cellobiosidase, and endo-β-1,6-galactanase. Pectins are the likely target of the GH28 family of genes. GH28 family members are polygalacturonase, exo-polygalacturonosidase, endo-xylogalacturonan hydrolase, and rhamnogalacturonan α-L-rhamnopyranohydrolase. The GH3 family is a group that catalyzes hydrolysis of terminal sugars, such as β-glucosidase, xylan 1,4-β-xylosidase, and α-L-arabinofuranosidase. The GH10 family is primarily endo-1,4-β-xylanases, which could target endo-hydrolysis of the xylan hemicelluloses.

Discussion

Our results showed that it was entirely feasible to produce a comprehensive set of structurally and functionally annotated genes for a basidiomycete fungus using only one Illumina short-read sequencing run. To achieve our goal, we made several choices to maximize our chances of producing a high quality assembly. We removed the mitochondria so that sequencing coverage was not wasted on non-nuclear DNA. A paired-end strategy was used because it was known to increase the N50 by 3-fold compared to single-end reads in Escherichia coli (Chaisson et al. 2009). We selected a 76 nt read length because it approximated the 60 nt read-length barrier of Saccharomyces cerevisiae (Chaisson et al. 2009). Above the read-length barrier, assemblies fail to improve; below the barrier, assemblies deteriorate (Chaisson et al. 2009). Furthermore, we selected Velvet because it was known to produce highly accurate assemblies (Chaisson and Pevzner 2008; Diguistini et al. 2009; Nowrousian et al. 2010), even in the presence of 1% sequencing errors or single nucleotide polymorphisms (Zerbino and Birney 2008). This was critical since short-read sequencing has a higher error rate than Sanger methods and the DNA we sequenced was diploid.

Once we knew that the DNA in our dataset would assemble, we proceeded to develop a systematic method for refining the assembly. Our approach used step-wise filters to create smaller, higher quality datasets. We found that the hash length, k, can have small, large, or unexpected effects on the N50 of the assembly depending upon the dataset and the value of k. Each dataset, however, was characterized by one assembly that exhibited a maximum N50 value, and empirical testing of k is recommended (Zerbino 2008).

Having identified the maximum N50 assemblies for each unfiltered and filtered dataset, our next goal was to find the threshold for filtering. A high stringency threshold will cause reductions in N50 with only minor gains in accuracy, while a low stringency threshold will produce modest gains in N50 at the expense of accuracy. We used a Venn analysis to identify the reads in the dataset that were both filtered and used in the assembly and then assessed their quality. By performing this analysis, we showed that the best assembly from the four datasets (original + 3 filtered) was not the one with the largest maximum N50, but, rather, was the assembly with the maximum k or specificity.

Planning for high nucleotide coverage is also recommended. Not only can one expect to lose a significant number of reads to filtering, but Velvet itself will discard many high quality reads. These reads generally are from repeat regions that are longer than the read length. In S. macrospora, neither ribosomal nor mitochondrial DNA could be found in the Velvet assembly (Nowrousian et al. 2010). Both, however, were assembled from the unused reads using CodonCode Aligner (Nowrousian et al. 2010). We expect a similar situation in our F. radiculosa assembly since we did not detect the ribosomal DNA sequence in the assembly, but had 11.6 M high quality reads in the unused readset.

The gene prediction tool, GeneMark-ES v2, was shown to be accurate and sensitive when tested on the genomes of nine different fungal species (Ter-Hovhannisyan et al. 2008). For basidiomycetes, like F. radiculosa, the branch point sequence, which guides lariat formation during splicing, is conserved (Kupfer et al. 2004). Because GeneMark allows for the presence and absence of branch point sequence in the intron model, the algorithm is better equipped to correctly locate the intron boundaries and misses fewer introns. This version of GeneMark also predicts genes ab initio and the hidden Markov model trains directly on the assembly, which means that we did not require an expressed sequence tag training set.

A problem that exists with automated gene prediction software, however, is its inability to predict genes from splice sites that use donor and acceptor sequences other than GT-AG. The most commonly encountered non-canonical splice site sequence is GC-AG. The frequency of GC-AG introns is 0.6% in Caenorhabditis elegans (Farrer et al. 2002), 0.7% in mammals (Burset et al. 2000), and 1.0 to 1.2% in ascomycete fungi (Rep et al. 2006). In Basidiomycota, genome surveys of non-canonical splice site frequencies have not yet been reported. Initial data, however, suggest that the rate may be higher. In 4 genes surveyed, 1/38 introns had a GC-AG splice site in Armillaria mellea (Misiek and Hoffmeister 2008) compared to 1/18 introns with a GC-AG splice site in F. radiculosa (J.D. Tang, unpublished). Another caveat to keep in mind is that sequencing diploid DNA tends to overestimate the number of genes as alleles get assigned to different contigs. Pipelines are being developed to automate the process of detecting and correcting these false segmental duplications (Kelley and Salzberg).
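As an illustration of how GC-AG frequencies of this kind can be tallied, the Python sketch below classifies introns by their terminal dinucleotides. It assumes each intron is supplied as a sense-strand sequence in the 5' to 3' direction, beginning with the donor dinucleotide and ending with the acceptor dinucleotide. The function names and toy sequences are hypothetical, and this is not the procedure used to obtain the counts cited above.

```python
def classify_intron(intron_seq):
    """Label an intron by its donor (first two) and acceptor (last two) bases."""
    seq = intron_seq.upper()
    if len(seq) < 4:
        raise ValueError("intron too short to hold donor and acceptor dinucleotides")
    pair = f"{seq[:2]}-{seq[-2:]}"
    if pair == "GT-AG":
        return "canonical"
    if pair == "GC-AG":
        return "GC-AG"
    return "other"


def gc_ag_fraction(introns):
    """Fraction of introns that use the non-canonical GC-AG splice site."""
    labels = [classify_intron(seq) for seq in introns]
    return labels.count("GC-AG") / len(labels)


# Toy introns (hypothetical sequences).
toy_introns = ["GTAAGTTTCTAACAG", "GCAAGTTTTTGACAG", "gtgagtctaatcag", "ATATCCTTTTAC"]
toy_labels = [classify_intron(seq) for seq in toy_introns]
toy_fraction = gc_ag_fraction(toy_introns)
print(toy_labels, toy_fraction)
```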

Although we did not have any reference sequence to align against, our comparison of F. radiculosa gene descriptive statistics with the version 2 assembly of the P. chrysosporium genome showed very close overall similarity. This was reassuring since two versions of the P. chrysosporium genome changed significantly with respect to number of genes, length of gene, CDS, and intron, and number of exons per gene (Vanden Wymelenberg et al. 2006). Comparison to public databases provided additional evidence that our gene predictions were reasonable. We found functional annotations for 58% of the genes, and of the top protein matches, F. radiculosa exhibited the greatest number of matches to P. placenta. Both are members of the family (Index Fungorum 2008), are copper-tolerant brown rot fungi, and are known to secrete high amounts of oxalate during wood decay (Green and Clausen 2003). The second most matches were against the dry rot fungus, S. lacrymans. The latter causes dry rot, which is a form of brown rot, is in the family Serpulaceae (Index Fungorum 2008), is not copper-tolerant, and secretes moderate levels of oxalate during decay (Green and Clausen 2003). Based on the number of top blastp hits, dry rot and brown rot fungi appear to share much protein sequence similarity.

When we compared wood decay genes from F. radiculosa with P. placenta (Martinez et al. 2009), we observed, for the most part, the same gene descriptions in both species, with similar numbers of genes within each functional category. For example, both species showed a similar richness in the kinds of extracellular glycoside hydrolases represented. For the GH families listed in Table 2.5, F. radiculosa had 79 genes compared to 76 genes for P. placenta (9 GH families had the same number of genes, 8 differed by 1 or 2 genes, and 2 differed by more than 2 genes). The top four GH families (GH16, GH5, GH28, and GH3, ranked from more to less abundance) were also similar. The other enzyme systems examined, like oxalate and H2O2 metabolism, lignin modification, and iron redox, displayed similar trends. In addition, neither species had genes with significant similarity to the cellobiose dehydrogenase from P. chrysosporium or the oxalate efflux transporter from F. palustris. Thus, the short-read sequencing strategy we used in this paper seems to have produced similar results to the shotgun sequencing strategy used to sequence the genome of P. placenta (Martinez et al. 2009).

Knowing the genetic potential of an organism is important, but it is just the first step to understanding how cells function to create one particular phenotype of a particular organism. The real value of a genomic resource grows when it is used to track gene and protein expression at different developmental stages, under different physiological conditions, and in response to different environmental stimuli. Only then can we begin to detail the molecular mechanisms that control the biological processes of cell function. In particular, we hope to understand the genetic mechanisms that regulate wood decay and copper tolerance so that we can provide efficient and cost-effective options for biomass to biofuel conversion, protect wood from decay by copper-tolerant fungi, and develop methods for bioremediation of copper-treated wood.

References Cited

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., Lipman, D.J. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25: 3389-402.

Bentley, D.R., Balasubramanian, S., Swerdlow, H.P., Smith, G.P., Milton, J., Brown, C.G., Hall, K.P., Evers, D.J., Barnes, C.L., Bignell, H.R., Boutell, J.M., Bryant, J., Carter, R.J., Keira Cheetham, R. 2008. Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456: 53-9.

Burset, M., Seledtsov, I.A., Solovyev, V.V. 2000. Analysis of canonical and non-canonical splice sites in mammalian genomes. Nucleic Acids Res. 28: 4364-75.

Cantarel, B.L., Coutinho, P.M., Rancurel, C., Bernard, T., Lombard, V., Henrissat, B. 2009. The Carbohydrate-Active EnZymes database (CAZy): an expert resource for glycogenomics. Nucleic Acids Res. 37: D233-8.

Chaisson, M.J., Brinza, D., Pevzner, P.A. 2009. De novo fragment assembly with short mate-paired reads: Does the read length matter? Genome Res. 19: 336-46.

Chaisson, M.J., Pevzner, P.A. 2008. Short read fragment assembly of bacterial genomes. Genome Res. 18: 324-30.

DiGuistini, S., Liao, N.Y., Platt, D., Robertson, G., Seidel, M., Chan, S.K., Docking, T.R., Birol, I., Holt, R.A., Hirst, M., Mardis, E., Marra, M.A., Hamelin, R.C., Bohlmann, J., Breuil, C., Jones, S.J. 2009. De novo genome sequence assembly of a filamentous fungus using Sanger, 454 and Illumina sequence data. Genome Biol. 10: R94.

DiGuistini, S., Wang, Y., Liao, N.Y., Taylor, G., Tanguay, P., Feau, N., Henrissat, B., Chan, S.K., Hesse-Orce, U., Alamouti, S.M., Tsui, C.K., Docking, R.T., Levasseur, A., Haridas, S., Robertson, G., Birol, I., Holt, R.A., Marra, M.A., Hamelin, R.C., Hirst, M., Jones, S.J., Bohlmann, J., Breuil, C. 2011. Genome and transcriptome analyses of the mountain pine beetle-fungal symbiont Grosmannia clavigera, a lodgepole pine pathogen. Proc. Natl. Acad. Sci. USA 108: 2504-9.

Emanuelsson, O., Brunak, S., von Heijne, G., Nielsen, H. 2007. Locating proteins in the cell using TargetP, SignalP and related tools. Nat. Protoc. 2: 953-71.

Farrer, T., Roller, A.B., Kent, W.J., Zahler, A.M. 2002. Analysis of the role of Caenorhabditis elegans GC-AG introns in regulated splicing. Nucleic Acids Res. 30: 3360-7.

Gotz, S., Garcia-Gomez, J.M., Terol, J., Williams, T.D., Nagaraj, S.H., Nueda, M.J., Robles, M., Talon, M., Dopazo, J., Conesa, A. 2008. High-throughput functional annotation and data mining with the Blast2GO suite. Nucleic Acids Res. 36: 3420-35.

Green, F., Clausen, C.A. 2003. Copper tolerance of brown-rot fungi: time course of oxalic acid production. Int. Biodeterior. Biodegrad. 51: 145-49.

Illumina 2008. Preparing samples for paired-end sequencing: instruction manual. San Diego, CA, Illumina.

Index Fungorum. 2008. Retrieved Aug. 20, 2011 from http://www.indexfungorum.org/Names/Names.asp.

Jensen, K.A., Ryan, Z.C., Wymelenberg, A.V., Cullen, D., Hammel, K.E. 2002. An NADH:quinone oxidoreductase active during biodegradation by the brown-rot basidiomycete Gloeophyllum trabeum. Appl. Environ. Microbiol. 68: 2699-703.

Kelley, D.R., Salzberg, S.L. 2010. Detection and correction of false segmental duplications caused by genome mis-assembly. Genome Biol. 11: R28.

Kupfer, D.M., Drabenstot, S.D., Buchanan, K.L., Lai, H., Zhu, H., Dyer, D.W., Roe, B.A., Murphy, J.W. 2004. Introns and splicing elements of five diverse fungi. Eukaryot. Cell 3: 1088-100.

Li, B., Nagalla, S.R., Renganathan, V. 1997. Cellobiose dehydrogenase from Phanerochaete chrysosporium is encoded by two allelic variants. Appl. Environ. Microbiol. 63: 796-9.

Li, R., Fan, W., Tian, G., Zhu, H., He, L., Cai, J., Huang, Q., Cai, Q., Li, B., Bai, Y., Zhang, Z., Zhang, Y., Wang, W., Li, J., Wei, F., Li, H., Jian, M., Nielsen, R., Li, D., Gu, W., Yang, Z., Xuan, Z., Ryder, O.A., Leung, F.C., Zhou, Y., Cao, J., Sun, X., Fu, Y., Fang, X., Guo, X., Wang, B., Hou, R., Shen, F., Mu, B., Ni, P., Lin, R., Qian, W., Wang, G., Yu, C., Nie, W., Wang, J., Wu, Z., Liang, H., Min, J., Wu, Q., Cheng, S., Ruan, J., Wang, M., Shi, Z., Wen, M., Liu, B., Ren, X., Zheng, H., Dong, D., Cook, K., Shan, G., Zhang, H., Kosiol, C., Xie, X., Lu, Z., Li, Y., Steiner, C.C., Lam, T.T., Lin, S., Zhang, Q., Li, G., Tian, J., Gong, T., Liu, H., Zhang, D., Fang, L., Ye, C., Zhang, J., Hu, W., Xu, A., Ren, Y., Zhang, G., Bruford, M.W., Li, Q., Ma, L., Guo, Y., An, N., Hu, Y., Zheng, Y., Shi, Y., Li, Z., Liu, Q., Chen, Y., Zhao, J., Qu, N., Zhao, S., Tian, F., Wang, X., Wang, H., Xu, L., Liu, X., Vinar, T., Wang, Y., Lam, T.W., Yiu, S.M., Liu, S., Huang, Y., Yang, G., Jiang, Z., Qin, N., Li, L., Bolund, L., Kristiansen, K., Wong, G.K., Olson, M., Zhang, X., Li, S., Yang, H. 2010. The sequence and de novo assembly of the giant panda genome. Nature 463: 311-7.

Martinez, D., Challacombe, J., Morgenstern, I., Hibbett, D., Schmoll, M., Kubicek, C.P., Ferreira, P., Ruiz-Duenas, F.J., Martinez, A.T., Kersten, P., Hammel, K.E., Vanden Wymelenberg, A., Gaskell, J., Lindquist, E., Sabat, G., Bondurant, S.S., Larrondo, L.F., Canessa, P., Vicuna, R., Yadav, J., Doddapaneni, H., Subramanian, V., Pisabarro, A.G., Lavin, J.L., Oguiza, J.A., Master, E., Henrissat, B., Coutinho, P.M., Harris, P., Magnuson, J.K., Baker, S.E., Bruno, K., Kenealy, W., Hoegger, P.J., Kues, U., Ramaiya, P., Lucas, S., Salamov, A., Shapiro, H., Tu, H., Chee, C.L., Misra, M., Xie, G., Teter, S., Yaver, D., James, T., Mokrejs, M., Pospisek, M., Grigoriev, I.V., Brettin, T., Rokhsar, D., Berka, R., Cullen, D. 2009. Genome, transcriptome, and secretome analysis of wood decay fungus Postia placenta supports unique mechanisms of lignocellulose conversion. Proc. Natl. Acad. Sci. USA 106: 1954-59.

Misiek, M., Hoffmeister, D. 2008. Processing sites involved in intron splicing of Armillaria natural product genes. Mycol. Res. 112: 216-24.

Nowrousian, M., Stajich, J.E., Chu, M., Engh, I., Espagne, E., Halliday, K., Kamerewerd, J., Kempken, F., Knab, B., Kuo, H.C., Osiewacz, H.D., Poggeler, S., Read, N.D., Seiler, S., Smith, K.M., Zickler, D., Kuck, U., Freitag, M. 2010. De novo assembly of a 40 Mb eukaryotic genome from short sequence reads: Sordaria macrospora, a model organism for fungal morphogenesis. PLoS Genet. 6: e1000891.

Paterson, A.H., Brubaker, C.L., Wendel, J.F. 1993. A rapid method for extraction of cotton (Gossypium spp.) genomic DNA suitable for RFLP or PCR analysis. Plant Mol. Biol. Reporter 11: 122-27.

R Development Core Team. 2011. R: A language and environment for statistical computing. Retrieved Mar. 10, 2011 from http://www.R-project.org.

Rep, M., Duyvesteijn, R.G., Gale, L., Usgaard, T., Cornelissen, B.J., Ma, L.J., Ward, T.J. 2006. The presence of GC-AG introns in Neurospora crassa and other euascomycetes determined from analyses of complete genomes: implications for automated gene prediction. Genomics 87: 338-47.

Schwartz, R.L., Phoenix, T. 2001. Learning Perl. Cambridge, O'Reilly.

Sequence Ontology Project. 2008. gtf2gff3 Perl script. Retrieved Jan. 5, 2010 from http://www.sequenceontology.org/cgi-bin/converter.cgi

Stajich, J.E., Block, D., Boulez, K., Brenner, S.E., Chervitz, S.A., Dagdigian, C., Fuellen, G., Gilbert, J.G., Korf, I., Lapp, H., Lehvaslaiho, H., Matsalla, C., Mungall, C.J., Osborne, B.I., Pocock, M.R., Schattner, P., Senger, M., Stein, L.D., Stupka, E., Wilkinson, M.D., Birney, E. 2002. The BioPerl toolkit: Perl modules for the life sciences. Genome Res. 12: 1611-18.

Suzuki, M.R., Hunt, C.G., Houtman, C.J., Dalebroux, Z.D., Hammel, K.E. 2006. Fungal hydroquinones contribute to brown rot of wood. Environ. Microbiol. 8: 2214-23.

Tanaka, H., Yoshida, G., Baba, Y., Matsumura, K., Wasada, H., Murata, J., Agawa, M., Itakura, S., Enoki, A. 2007. Characterization of a hydroxyl-radical-producing glycoprotein and its presumptive genes from the white-rot basidiomycete Phanerochaete chrysosporium. J. Biotechnol. 128: 500-11.

Ter-Hovhannisyan, V., Lomsadze, A., Chernoff, Y.O., Borodovsky, M. 2008. Gene prediction in novel fungal genomes using an ab initio algorithm with unsupervised training. Genome Res. 18: 1979-90.

Vanden Wymelenberg, A., Minges, P., Sabat, G., Martinez, D., Aerts, A., Salamov, A., Grigoriev, I., Shapiro, H., Putnam, N., Belinky, P., Dosoretz, C., Gaskell, J., Kersten, P., Cullen, D. 2006. Computational analysis of the Phanerochaete chrysosporium v2.0 genome database and mass spectrometry identification of peptides in ligninolytic cultures reveal complex mixtures of secreted proteins. Fungal Genet. Biol. 43: 343-56.

Varela, E., Tien, M. 2003. Effect of pH and oxalate on hydroquinone-derived hydroxyl radical formation during brown rot wood degradation. Appl. Environ. Microbiol. 69: 6025-31.

Wang, W., Huang, F., Mei Lu, X., Ji Gao, P. 2006. Lignin degradation by a novel peptide, Gt factor, from brown rot fungus Gloeophyllum trabeum. Biotechnol. J. 1: 447-53.

Watanabe, T., Shitan, N., Suzuki, S., Umezawa, T., Shimada, M., Yazaki, K., Hattori, T. 2010. Oxalate efflux transporter from the brown rot fungus Fomitopsis palustris. Appl. Environ. Microbiol. 76: 7683-90.

White, T.J., Bruns, T., Lee, S., Taylor, J. 1990. Amplification and direct sequencing of fungal ribosomal RNA genes for phylogenetics. In: PCR protocols: a guide to methods and applications, eds. Innis, M.A., Gelfand, D.H., Sninsky, J.J., White, T.J. Academic Press, San Diego, CA, pp. 315-22.

Wostemeyer, J., Kreibich, A. 2002. Repetitive DNA elements in fungi (Mycota): impact on genomic architecture and evolution. Curr. Genet. 41: 189-98.

Zerbino, D. 2008. Velvet manual version 0.7. August 29, 2008. Retrieved Mar. 3, 2010 from http://www.ebi.ac.uk/~zerbino/velvet/.

Zerbino, D.R., Birney, E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18: 821-29.

CHAPTER III

TRANSCRIPTOMIC ANALYSIS OF THE BROWN ROT FUNGUS

FIBROPORIA RADICULOSA ON WOOD TREATED

WITH A COPPER-BASED

PRESERVATIVE

Abstract

Many brown rot fungi are capable of rapidly degrading wood and are copper-tolerant. To better understand the genes that control these processes, we examined gene expression of Fibroporia radiculosa growing on wood treated with a copper-based preservative that combined copper carbonate with dimethyldidecylammonium carbonate. A global profiling strategy called RNA-Seq was used to quantify gene expression of the fungus at days 31 and 154. At day 31, the preservative was still protecting the wood, which showed no strength loss. At day 154, the effects of the preservative were gone, and the wood exhibited 52% strength loss. Statistical analysis identified 917 genes that were differentially expressed (FDR < 1E-4). Transcripts that showed higher expression levels at the early time point were associated with increased oxalate metabolism, laccase for hydroquinone-driven hydroxyl free radical production, pectin degradation, ATP production, xenobiotic detoxification, copper resistance, and stress response. Transcripts that showed higher expression levels at the late time point were involved in degradation of cellulose, hemicellulose, and pectin, hexose transport, oxalate catabolism, catabolism of laccase substrates, extracellular proton reduction, and remodeling of the fungal cell membrane and cell wall to enhance survival at low pH. A total of 108 differentially expressed genes were discussed in relation to their roles in wood decay and copper tolerance.

Introduction

Among decay fungi, brown rot fungi have evolved the ability to rapidly depolymerize cellulose during the early stages of decay (Curling et al. 2002; Daniel 2003). This ability stems from their unique capacity to use oxidative mechanisms to loosen the lignocellulose matrix, which gives the hydrolytic enzymes access to the structural polysaccharides so they can be degraded to their composite sugars. Wood oxidation has been attributed to the action of hydroxyl free radicals generated by the Fenton reaction (Fe²⁺ + H2O2 + H⁺ → Fe³⁺ + •OH + H2O) (Koenigs 1974; Kirk et al. 1991; Jensen et al. 2001; Arantes et al. 2009a). These low molecular weight reactants are small enough to diffuse through the tight lignocellulose matrix where they begin the decay process in regions that are not in direct contact with the hyphae (Daniel 2003). The net effect of •OH attack on the lignocellulose matrix is an increase in the pore size (Flournoy et al. 1991; Irbe et al. 2006) and surface area that the extracellular hemicellulases, cellulases, and other carbohydrate active enzymes can act upon (Cohen et al. 2005; Valaskova and Baldrian 2006). Over time, the rotted wood develops a dark brown, checked appearance, which is a residue of modified lignin that is left behind because brown rot fungi, unlike white rot fungi, lack the ability to mineralize lignin (Cowling 1961).

To orchestrate the decay process, brown rot fungi control the timing and production of a diverse set of supporting molecules. The presence of elevated levels of extracellular oxalate (Green et al. 1991; Ritschkoff et al. 1995; Varela and Tien 2003; Schilling and Jellison 2005) and H2O2 (Highley 1987; Ritschkoff et al. 1995; Kim et al. 2002) during brown rot decay has been well documented for many species. H2O2 is a Fenton reactant so its role in oxidative decay is clear, but the significance of oxalic acid is more complex and elusive. Oxalic acid, which has pKa values of 1.38 and 4.28, contributes to the drop in substrate pH that typically occurs during colonization by the fungus. The pH typically ranges from 2.0 to 5.0, depending upon the fungal species and the substrate being colonized (Green et al. 1991; Dutton et al. 1993; Humar et al. 2001; Schilling and Jellison 2005, 2006).
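
As a rough illustration of what the two pKa values quoted above imply over the pH 2.0 to 5.0 range, the following sketch applies the standard equilibrium expressions for a diprotic acid. It assumes ideal dilute-solution behavior and ignores metal complexation, so it is not a model of conditions in decaying wood; the function name is illustrative only.

```python
def oxalate_fractions(ph, pka1=1.38, pka2=4.28):
    """Return the fractions of H2C2O4, HC2O4- and C2O4(2-) at a given pH.

    Uses the textbook diprotic-acid expressions; the default pKa values are
    the ones quoted in the text. Activity corrections are ignored (assumption).
    """
    h = 10.0 ** (-ph)        # hydrogen ion concentration
    ka1 = 10.0 ** (-pka1)    # first dissociation constant
    ka2 = 10.0 ** (-pka2)    # second dissociation constant
    denom = h * h + ka1 * h + ka1 * ka2
    return (h * h / denom, ka1 * h / denom, ka1 * ka2 / denom)


if __name__ == "__main__":
    for ph in (2.0, 3.0, 4.0, 5.0):
        acid, mono, di = oxalate_fractions(ph)
        print(f"pH {ph:.1f}: H2C2O4={acid:.2f} HC2O4-={mono:.2f} C2O4(2-)={di:.2f}")
```

Under these assumptions the singly deprotonated form dominates near pH 2 to 4, and the fully deprotonated form becomes the major species above the second pKa.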

Oxalic acid also has roles in solubilizing and chelating metals. Its ability to chelate metal in different stoichiometric ratios, however, affects how tightly the iron is bound by oxalate in the complex. Thus, the metal can be bound reversibly or irreversibly. In addition, it has been determined that higher oxalate:metal ratios require compounds with greater reducing potentials to reduce the iron. During wood decay, there are three proposed iron reductants: hydroquinones (Kerem et al. 1999; Suzuki et al. 2006), phenolate compounds (Goodell et al. 1997; Arantes et al. 2011), and low molecular weight peptides (Wang et al. 2006). Currently, the interaction of the iron reductant at different pH values and oxalate concentrations has only been studied for the hydroquinones (Varela and Tien 2003) and the phenolates (Arantes et al. 2009b). Results showed that the lower the oxalate concentration and the higher the pH, the lower the ratio of oxalate:iron in the complex. The interpretation is that as the complex diffuses along the oxalate concentration and pH gradients away from the hyphae, the iron is bound less tightly, increasing its chances of being reduced by the reducing agent. Furthermore, at some point, binding sites in the wood show greater affinity for the iron and the iron moves from the oxalate complex to the wood (Hammel et al. 2002; Arantes et al. 2009b). Because of these chemical relationships, the hyphae, which reside in the tracheid lumen, are protected from oxidative damage. Scission only occurs deep within the thick S2 layer of the secondary cell wall (Daniel 2003; Schwarze 2007), which is about 54% cellulose (Rowell 2005). Here, the Fe²⁺ reacts with H2O2 to form the highly reactive but short-lived •OH that randomly cleaves the cellulose as well as other components of the lignocellulosic matrix (Arantes et al. 2011).

Many studies have shown that at toxic or abnormally high metal concentrations, many fungi respond by irreversibly binding and precipitating the excess metal in the form of metal oxalate crystals (Jarosz-Wilkolazka and Gadd 2003; Schilling and Jellison 2004; Fomina et al. 2005). Increased oxalate levels (Clausen and Green 2003; Green and Clausen 2003) and copper oxalate precipitation (Clausen et al. 2000) are also the generally accepted mechanisms for copper tolerance in several species of brown rot fungi. Even in the absence of toxic metal concentrations, though, many brown rot fungi continually secrete oxalate as part of their basal metabolism (Takao 1965; Connolly et al. 1996). Under these conditions, the fungi typically produce calcium oxalate crystals, since calcium tends to be a readily available metal. Formation of the calcium oxalate crystals also suggests that oxalate becomes toxic if it is not continually removed.

An important aspect raised by investigators who study hydroquinone-driven hydroxyl free radical production is that in situ oxalate concentrations for most oxalate-producing brown rot fungi are generally high enough to prevent iron reduction by the hydroquinone in the iron oxalate complex (Wei et al. 2010). Recent genomic sequencing studies, however, have identified laccase genes in two species of brown rot fungi, Postia placenta (Martinez et al. 2009) and Fibroporia radiculosa (Tang et al. 2010). Activity of a purified laccase enzyme was also demonstrated in P. placenta by heterologous expression of a laccase cDNA (Wei et al. 2010). The importance of laccase is that it can catalyze the oxidation of hydroquinones and phenolates (Gianfreda et al. 1999), both of which are compounds that are detected during decay (Suzuki et al. 2006; Arantes et al. 2011), to their corresponding free radicals. The latter have more energy than the parent compound and, at least for the semiquinones, theoretically have a high enough reducing potential to reduce iron in the iron oxalate complex (Wei et al. 2010). The series of eight reactions that generate Fe²⁺ and H2O2 for a complete Fenton system is detailed in Wei et al. (2010).

The role of oxalate as a source of biochemical energy during brown rot decay has also been elucidated. Using enzyme assays, Munir et al. (2001) described the steps in the oxalate biosynthetic pathway and were able to demonstrate that the tricarboxylic acid (TCA) and the glyoxylate (GLOX) cycles were metabolically coupled through the action of isocitrate lyase in the brown rot fungus, Fomitopsis palustris. Clear demonstration of the genes that regulate the GLOX cycle in brown rot fungi, however, is lacking (Martinez et al. 2009). In addition, although a > 2-fold increase of laccase gene expression was demonstrated in liquid cultures of P. placenta on ball-milled aspen compared to ball-milled pine, no concurrent increase in expression of genes for oxalate production was described (Vanden Wymelenberg et al. 2010). Therefore, the in vivo role of laccase for producing Fenton reactants during periods of high oxalate production was unclear.

Global sequencing strategies, like RNA-Seq, have been very successful for quantifying the transcriptome, or all the RNA that is being expressed at a particular time in a specific cell type (Mortazavi et al. 2008; Wang et al. 2009). In this study, we used RNA-Seq to identify the genes that regulate oxalate production and degradation of the structural polysaccharides during decay of solid wood treated with a copper-based wood preservative. Our hypothesis was that expression of genes related to high oxalate production and copper precipitation would be up-regulated early, when wood showed no strength loss, while the expression of genes degrading the structural polysaccharides would be up-regulated late, when wood exhibited high strength loss. Our results were informative, identifying many key genes that regulate degradation of wood polysaccharides, oxalate metabolism, Fenton chemistry, copper resistance, response to stress, xenobiotic degradation, and changes in membrane lipid composition to enhance survival in low pH environments.

Materials and Methods

Fungus Isolate

F. radiculosa strain TFFH 294 was generously donated by Carol Clausen, USDA Forest Service, Forest Products Laboratory, Madison, WI.

Decay Tests

The decay rate of southern yellow pine wafers (n=5) was evaluated by loss of compression strength over time using a modified accelerated soil block test, E22-09 (AWPA 2009). Wood wafers (54 x 18 x 5 mm) were pressure-treated with 0.8% (wt/wt) micronized copper quaternary ammonium compound (MCQ) to an average retention (± SD) of 5.47 ± 0.12 kg MCQ/m³. The MCQ was a 2:1 (wt:wt) formulation of CuCO3:DDAC (dimethyldidecylammonium carbonate). The percent active ingredient for the CuCO3 and the DDAC stock solutions was 41% and 47%, respectively. All materials used in the soil block test containers were sterilized prior to use. The containers held two southern yellow pine feeder strips that were laid into the container with the flat side in contact with a bed of moist soil. The feeder was inoculated with the fungus and allowed to grow at 28°C until the feeder was fully colonized (about 2 wks). The treated wafer was then laid cross-wise onto the colonized feeder strips. After 25, 31, 46, 63, 70, and 154 days of exposure to the fungus, the test wafer was removed, cut to 18 x 18 x 5 mm, and used to measure strength loss. Percent loss of compression strength was calculated relative to unexposed treated wafers cut from the same board. On days 31 and 154, about 2/3 of the wafer was used for the RNA isolation.
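
The strength-loss calculation described above can be sketched as follows. This is a minimal illustration, assuming that percent loss is computed against the mean strength of the unexposed treated controls and that the standard error is taken over the exposed replicates; the function names are hypothetical and not taken from the test standard.

```python
from statistics import mean, stdev


def percent_strength_loss(exposed, control_mean):
    """Percent loss of compression strength relative to the control mean."""
    return 100.0 * (control_mean - exposed) / control_mean


def summarize_losses(exposed_values, control_values):
    """Return (mean percent loss, standard error) for one exposure time.

    Assumption: each exposed wafer is compared with the mean of the
    unexposed treated wafers cut from the same board.
    """
    control_mean = mean(control_values)
    losses = [percent_strength_loss(v, control_mean) for v in exposed_values]
    standard_error = stdev(losses) / len(losses) ** 0.5
    return mean(losses), standard_error
```

A negative value simply means that an exposed wafer tested stronger than the control mean, which can occur through ordinary wood variability when no decay has taken place.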

RNA and Reverse Transcription (RT) Products

Our RNA isolation protocol used the Ambion RNAqueous Kit (Ambion; Austin, TX) combined with on-column DNase I digestion (Qiagen 2010). Samples were kept on ice and handling time was minimized to prevent as much RNA degradation as possible. The treated wafer was rasped in 5 x 0.2 g portions. As each portion was generated, it was flash frozen in liquid N2 then stored at -80°C. RNA was released from the sample by adding the kit denaturation solution, followed by beating (2 x 3 min separated by 3 min rest on ice) in a Minibeadbeater 16 (Bio Spec Products; Bartlesville, OK). Spectrophotometric determination on a Nanodrop 1000 (Thermo Fisher Scientific; Pittsburgh, PA) showed that our yields of total RNA (± SD) for the day 31 and day 154 samples were, respectively, 3.5 ± 1.2 and 18.3 ± 4.2 µg/g rasped wood. The quality of extracted RNA was determined by Experion chip electrophoresis (RNA StdSens Analysis Kit, Bio-Rad; Hercules, CA).

We also cloned and sequenced the RT-PCR products from the 5'- and 3'- ends of four wood decay genes (a multicopper oxidase, a laccase, and two copper radical oxidases). The amplified RT product was gel purified and cloned into the pGEM T-Easy vector system (Promega; Madison, WI). Three clones from each cloning reaction were picked for isolation of the recombinant plasmid DNA. After cutting the plasmid DNA with EcoRI, the size of the restriction fragment was determined by gel electrophoresis. Clones that produced fragments with the correct size based on the predicted coding sequence were selected for sequencing. The inserts were sequenced in both forward and reverse directions by capillary electrophoresis on a Beckman CEQ 8000 (Beckman Coulter, Brea, CA). The CEQ 8000 uses fluorescently-labeled dideoxy terminators to label each nucleotide sequenced. These eight RT products spanned a total of 18 introns.

RNA-Seq Libraries

RNA (5 µg) from each sample was used to generate libraries according to the protocol provided with the Illumina mRNA-Seq Sample Prep Kit (Illumina; San Diego, CA). Briefly, the protocol had 9 steps: selection of mRNA with oligo d(T) magnetic beads, fragmentation, 1st strand cDNA synthesis with random hexanucleotide oligos, 2nd strand cDNA synthesis, end repair, 3' adenylation of fragment ends, ligation of a forked adapter to each end, size selection, and PCR enrichment of fragments that have an adapter on each end. The library concentration and fragment size of each library were determined by Experion chip electrophoresis (DNA 1K Kit, Bio-Rad; Hercules, CA). The results showed that the library concentrations ranged from 11 to 102 nM and each library contained a peak fragment size between 363 and 398 bp. All libraries were diluted to 10 nM.

Short-read sequencing of the libraries was done in 7 lanes on one single-end flow cell run of an Illumina GAIIx instrument (76 nt read length; day 31, n = 3; day 154, n = 4) with one lane reserved for the instrument control sample. Raw sequence data were processed using Firecrest (image analysis) and Bustard (basecalling) as part of the Illumina GAPipeline v1.4.0. The data obtained were in a colon-delimited SCARF format (Solexa Compact ASCII Read Format), which included the information for each read on one line separated by colons (sequencer name, lane number, tile number, x and y cluster coordinates, paired-read designation, base calls, and ASCII quality scores).
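
To make the record layout concrete, the sketch below converts one SCARF line into a four-line FASTQ record. It is an illustration only, not the script used in this study: it assumes the base calls and quality string are the last two colon-separated fields and that the quality string itself contains no colon, and the example read name is invented.

```python
def scarf_to_fastq(line):
    """Convert one colon-delimited SCARF line to a FASTQ record string.

    Assumption: the last two fields are base calls and ASCII qualities;
    everything before them is kept verbatim as the read identifier.
    """
    record = line.rstrip("\r\n")
    try:
        read_id, bases, quals = record.rsplit(":", 2)
    except ValueError:
        raise ValueError(f"not a SCARF record: {record!r}")
    if len(bases) != len(quals):
        raise ValueError(f"base/quality length mismatch in {read_id!r}")
    return f"@{read_id}\n{bases}\n+\n{quals}\n"


def convert_lines(lines):
    """Yield FASTQ records for every non-empty SCARF line."""
    for line in lines:
        if line.strip():
            yield scarf_to_fastq(line)


if __name__ == "__main__":
    # Hypothetical record, for illustration only.
    print(scarf_to_fastq("MACHINE:1:1:12:345#0/1:ACGTN:abcdB"), end="")
```

Splitting from the right keeps the sketch indifferent to how many colon-separated fields make up the read identifier.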

Alignment to the Predicted Coding Sequence (CDS)

SCARF files were converted to FASTQ format then trimmed and filtered to remove poor quality sequence according to the following criteria: bases with "B" quality scores were trimmed, reads that had < 38 bases left after trimming were discarded, and any read that began with an "N" was discarded. An ultrafast alignment tool called Bowtie v0.12.7 (Langmead et al. 2009) was used to map the filtered FASTA dataset against the predicted CDS obtained from our genomics analysis (Tang et al. 2010). The reference CDS database had 5407 gene ontology (GO) annotations from a total of 9262 sequences (Tang et al. 2010). It should be noted that because the genome was assembled from heterokaryotic DNA (2N), it is possible that some of the genes we refer to may be alleles of the same gene. In addition, all CDS sequences were predicted ab initio using gene prediction software and their individual sequences have not been verified by cDNA cloning. Likewise, functions were inferred from electronic annotation and have not been confirmed empirically. Bowtie options were set to limit the output to unique alignments, and a maximum of 2 mismatches was allowed in the first 28 bases of the read. The number of alignments to each CDS was then tallied.
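
The filtering criteria and the final tally can be sketched as below. This is a hedged illustration rather than the custom scripts used here: it assumes that "B" qualities occur as a run at the 3' end of a read and are trimmed from there, and that alignments are available in Bowtie's default tab-delimited output, in which the third column names the reference sequence. Function names are hypothetical.

```python
from collections import Counter

MIN_LENGTH = 38  # minimum read length after trimming, as stated in the text


def trim_and_filter(bases, quals, min_length=MIN_LENGTH):
    """Return trimmed (bases, quals), or None if the read is discarded.

    Assumption: low-quality "B" scores form a trailing run, so trimming
    removes them from the 3' end only.
    """
    if bases.startswith("N"):
        return None                      # read begins with an uncalled base
    keep = len(quals.rstrip("B"))        # length left after removing the B tail
    if keep < min_length:
        return None                      # too short after trimming
    return bases[:keep], quals[:keep]


def tally_alignments(bowtie_lines):
    """Count alignments per reference sequence (third tab-separated column)."""
    counts = Counter()
    for line in bowtie_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            counts[fields[2]] += 1
    return counts
```

Because the aligner output was restricted to unique alignments, each line contributes to exactly one CDS, so a simple counter is enough for the tally.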

Identification of Differentially Expressed Genes

Analyses were done using the edgeR package (Robinson and Oshlack 2010; Robinson et al. October 2010, posting date) in Bioconductor. Outlier replicates were identified by a multidimensional scaling plot. MA plots of each pair-wise sample combination were used to determine the effects of normalization. M is the log of the concentration ratio and A is the average of the two log concentrations. In MA plots, concentration is an estimate of the concentration of the transcript in the original samples in the comparison. Prior to the differential gene expression analysis, genes that had count sums less than 10 were removed from the dataset. The distribution of the genes that were differentially expressed was visualized in a plot of log fold change vs log concentration. In the latter case, concentration is an estimate of the concentration of a transcript across all samples. We selected edgeR because it adds a proportionality factor to the normalization scheme, which helps to offset the compositional differences of the RNA being sampled. EdgeR also estimates a common dispersion parameter from pseudocount data (generated under the hypothesis that the means are different) using a quantile-to-quantile method for a negative binomial distribution. This parameter compensates for small sample size. For differential expression analysis, edgeR includes an exactTest function that computes an exact p-value for the log transformed normalized data in a manner very similar to Fisher's exact test. Since multiple testing increases the Type I or false positive error rate, the method of Benjamini and Hochberg (1995) was employed to control the false discovery rate (FDR), which was set to a threshold level of 1E-4.
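
Three of the steps above are simple enough to sketch directly: the low-count filter, the M and A values of an MA plot, and the Benjamini and Hochberg adjustment. The code is a plain-Python illustration under stated assumptions, not edgeR itself: base-2 logarithms are assumed for M and A, and the p-values are taken as given rather than computed from a negative binomial model.

```python
import math


def filter_low_counts(counts, min_total=10):
    """Keep genes whose counts summed over all libraries reach min_total.

    counts maps a gene identifier to a list of per-library read counts.
    """
    return {gene: c for gene, c in counts.items() if sum(c) >= min_total}


def ma_values(conc1, conc2):
    """Return (M, A): log ratio and mean log concentration (log2 assumed)."""
    m = math.log2(conc1 / conc2)
    a = 0.5 * (math.log2(conc1) + math.log2(conc2))
    return m, a


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

A gene would then be called differentially expressed when its adjusted p-value falls below the chosen FDR threshold, which was 1E-4 in this study.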

Mining for Genes Related to Decay of Preservative-Treated Wood

The Blast2GO suite (v 2.4.9) was used to determine if there were significant differences in the GO terms represented between the early and late differentially expressed genes. The enrichment analysis was performed using Fisher's exact test with correction for multiple hypothesis testing (FDR < 0.05) (Benjamini and Hochberg 1995), then results were reduced to the most specific GO terms found. The functional descriptions of genes in the early and late groups were examined and the genes that were related to inactivation of the preservative treatment and loss of wood strength were discussed in the results. We also searched the list of differentially expressed genes identified by edgeR for keywords specifically related to the hydrolytic breakdown of pectin, hemicellulose, and cellulose (glycoside hydrolases), sugar transport, copper tolerance (genes involved in oxalate metabolism), lignin modification (laccases and multicopper oxidases), and oxidative breakdown of wood by the Fenton reaction (genes involved in H2O2 metabolism, iron reduction, and quinone redox cycling). To better distinguish between the extracellular activities of glycoside hydrolases involved in wood decay versus fungal cell wall remodeling, we referred to the functional descriptions found for the glycoside hydrolase families in the Carbohydrate-Active enZymes Database (available at http://www.cazy.org) (Cantarel et al. 2009). Only those glycoside hydrolase genes that had functions limited to wood decay were reported. If the original annotation only gave the glycoside hydrolase family (Tang et al. 2010), then blastp analysis (Altschul et al. 1997) of the predicted protein against the NCBI nr protein database taxon Fungi (available at http://blast.ncbi.nlm.nih.gov) was performed. This often identified a more specific function within the glycoside hydrolase family, which was reported in Table 3. Relatively short sequences were considered to be gene fragments and were excluded from the interpretation of our results.
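
For readers unfamiliar with the enrichment test, the sketch below computes a one-sided Fisher's exact p-value for over-representation of a single GO term from the hypergeometric distribution. It is an illustration only: the internal implementation in Blast2GO may differ (for example, in sidedness), and the argument names are hypothetical.

```python
from math import comb


def fisher_enrichment_pvalue(hits_in_set, set_size, hits_total, population):
    """One-sided Fisher's exact p-value, P(X >= hits_in_set).

    hits_in_set: genes in the test set annotated with the GO term
    set_size:    number of genes in the test set
    hits_total:  genes annotated with the term in the whole population
    population:  total number of genes considered
    """
    upper = min(set_size, hits_total)
    total = comb(population, set_size)
    tail = sum(
        comb(hits_total, i) * comb(population - hits_total, set_size - i)
        for i in range(hits_in_set, upper + 1)
    )
    return tail / total
```

The resulting p-values, one per GO term, would then be corrected for multiple testing before applying the FDR < 0.05 cut-off described above.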

Computational Analysis

All software tools mentioned were in the public domain. Custom Perl (Schwartz and Phoenix 2001) and BioPerl (Stajich et al. 2002) scripts were written by J.D. Tang unless otherwise specified.

Results of the Study

Soil Block Test

During the first 70 days of fungus exposure in the accelerated soil block tests, wood wafers treated with MCQ showed no appreciable loss in compression strength (Figure 3.1.A). After 154 days of exposure, however, the wood wafers showed 52% strength loss (Figure 3.1.A). RNA-Seq libraries were prepared from the day 31 and 154 time points because of the differences in strength loss and because both time points produced adequate yields of RNA. Earlier than 31 days, the RNA yields tended to be low. The fungus also showed distinct morphological changes between the two time points when RNA samples were taken (Figure 3.1.B and 3.1.C). At day 31, the hyphae were white, highly branched, and exhibited radial growth patterns (Figure 3.1.B). At day 154, the hyphae were more yellow-colored, less branched, and displayed more linear growth patterns (Figure 3.1.C).


[Figure 3.1 panels: (A) plot of percent strength loss versus fungal exposure (days); (B) and (C) photographs.]

Figure 3.1 Percent compression strength loss (± SE) of MCQ-treated wafers after exposure to F. radiculosa in an accelerated soil block test (A). RNA-Seq libraries were prepared from samples taken at day 31 (B) and day 154 (C). Arrows point to the MCQ-treated wafer, which was laid on the feeder strips in the soil block test.

RNA and RT Products

Representative electropherograms of RNA isolated from day 31 (Figure 3.2.A) and day 154 (Figure 3.2.B) wafers showed little degradation. The two tall peaks eluting near 40 and 45 s were the major ribosomal RNAs (18S and 28S, respectively). Degraded RNA would have displayed a broad peak caused by random fragmentation of the ribosomal RNA. By cloning the RT-PCR products associated with the 5'- and 3'-ends of four genes, we determined that the RNA did not contain inhibitors that might interfere with downstream reactions. In addition, although amplified RT-PCR products showed 100% sequence identity to the genomic sequence, we found that 1 of 18 introns was missed by the gene prediction algorithm because it was bordered by a non-canonical splice site (GC-AG).
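The splice-site observation can be illustrated with a minimal sketch that reads the first two and last two bases of an intron and labels the boundary pair. The function name `classify_intron` and the example sequences are hypothetical and are not taken from the scripts used in this study; the sketch assumes GT-AG is the only canonical pair, so GC-AG is reported as non-canonical.

```python
def classify_intron(intron_seq):
    """Label an intron by its donor (first two bases) and acceptor (last two bases)."""
    seq = intron_seq.upper()
    if len(seq) < 4:
        raise ValueError("intron too short to have distinct donor and acceptor sites")
    pair = f"{seq[:2]}-{seq[-2:]}"
    # Assumption: only GT-AG is treated as canonical in this sketch.
    label = "canonical" if pair == "GT-AG" else "non-canonical"
    return pair, label


# Hypothetical intron sequences, for illustration only.
print(classify_intron("GTAAGTCTTTCAG"))
print(classify_intron("GCAAGTCTTTCAG"))
```

A gene predictor that accepts only the canonical pair would skip the second intron, which is the kind of miss described above.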


Figure 3.2 Typical electropherograms of RNA isolated from day 31 (A) and day 154 (B) MCQ-treated wafers.

Bowtie Alignments

After filtering, the number of reads per sample ranged from 28 to 33 M. Of these, 17 to 21 M (61 to 64%) aligned to only one location in the reference CDS database. The remaining reads were not included in our gene expression analysis because they either had more than one alignment to the CDS (0.14 to 0.27%) or failed to align (36 to 39%). Reasons why a read might fail to map include too many mismatches, alignment to a CDS region that contained an "N" nucleotide, or partial alignment to the ends of a CDS. Reads that spanned a non-canonical splice junction would also fail to align due to the number of mismatches.
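The three read categories above partition each library, so their percentages should sum to 100. The following is a minimal sketch of that bookkeeping; the function name `read_fates` and the read counts are hypothetical values chosen to fall inside the reported ranges, not counts from any actual sample.

```python
def read_fates(total, unique, multi):
    """Return the percentage of reads with one alignment, several alignments, or none."""
    if unique + multi > total:
        raise ValueError("aligned reads exceed the total number of reads")
    unaligned = total - unique - multi
    counts = {"unique": unique, "multi": multi, "unaligned": unaligned}
    return {name: 100.0 * n / total for name, n in counts.items()}


# Hypothetical library: 30 M filtered reads (illustrative numbers only).
fates = read_fates(total=30_000_000, unique=18_600_000, multi=60_000)
for name, pct in fates.items():
    print(f"{name}: {pct:.2f}%")
```

Only the reads in the `unique` category would be carried forward into the expression analysis.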

Identification of Differentially Expressed Genes

The multidimensional scaling plot (Figure 3.3) showed that the early group was more tightly clustered than the late group, which exhibited spread across the second dimension. Regardless, the scale for each dimension encompassed a narrow range, so none of the samples could be classed as outliers. Inspection of each MA plot indicated successful normalization and no bias in the data. Representative MA plots (E1 versus L1) show the distribution of the points before (Figure 3.4.A) and after (Figure 3.4.B) normalization. Before normalization, the red line, which marks the estimated trimmed mean of the M values, deviated from the M = 0 line, indicating that the two datasets did not have similar distributions. After normalization, the points were shifted down by the appropriate distance to account for the compositional bias in the library size. After normalization, the red line was at M = 0 but was not plotted there so that the shift in point distribution could be viewed.
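The idea behind the MA plot and the trimmed mean adjustment can be sketched in a few lines. This is a simplified illustration, not the edgeR implementation: it trims only on M and takes an unweighted mean, whereas edgeR also trims on A and weights the genes. The function names and the toy counts are hypothetical.

```python
import math


def ma_values(counts_early, counts_late):
    """Return (M, A) pairs for genes with nonzero counts in both libraries."""
    total_e = sum(counts_early)
    total_l = sum(counts_late)
    pairs = []
    for e, l in zip(counts_early, counts_late):
        if e == 0 or l == 0:
            continue  # zero counts have no finite log ratio
        p_e = e / total_e
        p_l = l / total_l
        m = math.log2(p_l / p_e)        # M: log ratio, late relative to early
        a = 0.5 * math.log2(p_l * p_e)  # A: average log abundance
        pairs.append((m, a))
    return pairs


def trimmed_mean_of_m(pairs, trim=0.3):
    """Unweighted mean of M after dropping the lowest and highest `trim` fraction."""
    m_sorted = sorted(m for m, _ in pairs)
    k = int(len(m_sorted) * trim)
    kept = m_sorted[k:len(m_sorted) - k]
    return sum(kept) / len(kept)


# Toy libraries: one dominant gene in the early library inflates its size,
# so every other gene appears to be higher in the late library.
early = [100] * 9 + [1100]
late = [100] * 10
pairs = ma_values(early, late)
offset = trimmed_mean_of_m(pairs)
normalized = [(m - offset, a) for m, a in pairs]
print(offset)
```

In this toy case the offset is positive, so subtracting it shifts the points down, the same direction as the shift described for E1 versus L1.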


[Figure 3.3 plot: samples E1 to E3 and L1 to L4 plotted on Dimension 1 (x-axis) versus Dimension 2 (y-axis).]

Figure 3.3 Multidimensional scaling plot used to identify outlier samples. E1 - E3 and L1 - L4 were the early and late replicates, respectively.


Figure 3.4 MA plot of E1 versus L1 before (A) and after normalization (B). E1 and L1 refer to the first replicate of the early and late samples, respectively. The red line was the estimated trimmed mean of M values and was the adjustment applied to account for the compositional bias in the library size. The orange points had a zero or low count number and were artificially represented at the left edge of the graph.

A plot of log fold change versus log concentration showed the distribution of the 9262 transcripts being compared between the early and late time points (Figure 3.5). There were 917 genes that showed significant differences in their log fold change expression levels (red points, FDR < 1E-4; Figure 3.5). Among these, 463 transcripts exhibited increased expression levels at the early time point and had negative log fold change values (average fold change, 4.3x; maximum fold change, 58x; minimum fold change, 2.3x). The remaining 454 genes exhibited increased expression levels at the late time point and had positive log fold change values (average fold change, 5.7x; maximum fold change, 62x; minimum fold change, 2.4x).
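The sign convention and the fold change values quoted above can be made concrete with a short sketch. It assumes that logFC is a base-2 logarithm, which is consistent with fold change lines of 4 sitting at logFC values of -2 and +2; the function name `summarize_gene` and the example inputs are hypothetical.

```python
def summarize_gene(log_fc, fdr, fdr_cutoff=1e-4):
    """Assign a gene to the early or late group and report its fold change.

    Assumes log_fc is log2(late / early), so negative values mean higher
    expression at the early time point.
    """
    fold_change = 2.0 ** abs(log_fc)
    if fdr >= fdr_cutoff:
        group = "not significant"
    elif log_fc < 0:
        group = "early"
    else:
        group = "late"
    return group, fold_change


# Hypothetical genes, for illustration only.
print(summarize_gene(-2.0, 1e-6))
print(summarize_gene(2.0, 1e-6))
print(summarize_gene(3.0, 0.01))
```

Reporting the fold change from the absolute value of logFC matches the way the early and late groups are both described with positive fold changes in the text.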

Figure 3.5 Plot of log fold change (logFC) versus log concentration (logConc). Genes that showed increased expression at the early (E) and late (L) time points had negative and positive logFC values, respectively. The blue lines marked fold change values of 4. Significant logFC values were colored red (FDR < 1E-4). Genes with logConc values close to 0 were highly abundant across all samples. Three points with one or more zero count values (orange) were artificially represented at the left edge of the graph.


Mining for Genes Related to Decay of Preservative-Treated Wood

A bar chart of the differential GO analysis is shown in Figure 3.6. This bar chart displays the percentage of genes for each GO term that differed significantly in abundance between the highly expressed early and late groups. These groups contained the genes that were identified by the edgeR analysis as showing significantly higher expression at day 31 (early) or day 154 (late). In the GO domain molecular function (Figure 3.6.A), the highly expressed early group had a significantly smaller percentage of genes with the GO term hydrolase activity (14%) than the highly expressed late group (27%) (FDR=0.002). The GO term binding, however, showed greater representation in the highly expressed early group (62%) compared to the highly expressed late group (48%) (FDR=0.03). In the GO domain biological process (Figure 3.6.B), a smaller percentage of genes from the highly expressed early group were associated with the GO term lipid metabolic process (1%) than in the highly expressed late group (7%) (FDR=0.02). A significantly greater percentage of the highly expressed early genes (14%) had the GO term cellular metabolic process than the highly expressed late genes (6%) (FDR=0.04). In the last GO domain, cellular component (Figure 3.6.C), the GO term intracellular membrane-bounded organelle was found more often among the highly expressed early genes (12%) than the highly expressed late genes (4%) (FDR=0.004). The GO term cytoplasmic part was also more common among highly expressed early genes (14%) than the highly expressed late genes (6%) (FDR=0.04).

Genes related to wood decay and preservative tolerance were first mined from the differential GO term lists. The mining process began by reviewing the annotations for each gene. As more genes with related functions and target motifs were identified, their co-regulation served to substantiate the importance of their metabolic end product or the cellular function that resulted from their combined activities. Once the metabolite or cellular function was identified, additional genes that were missing from the GO term list were identified by searching the list of 917 differentially expressed genes. The mining process was somewhat analogous to a puzzle. As more puzzle pieces (genes) were put into place, the larger picture (metabolite or cellular function) became clearer, which made it easier to find additional pieces (genes from the list of 917 differentially expressed genes) to fit into the picture. The entire list of all genes that were mined by this process is shown in Table 3.1.

[Figure 3.6 panels: horizontal bar charts of percent genes (axis 0 to 65) for the highly expressed early and highly expressed late groups. (A) Molecular function: hydrolase activity, binding. (B) Biological process: lipid metabolic process, cellular metabolic process. (C) Cellular component: intracellular membrane-bounded organelle, cytoplasmic part.]

Figure 3.6 Differential GO term distribution for the three domains: molecular function (A), biological process (B), and cellular component (C). Each term showed a significant difference in percent genes represented between the highly expressed early and late groups (FDR threshold, 0.05). The percent gene calculation was based on the total number of genes within each group.


Table 3.1 List of the 108 differentially expressed genes (FDR < 1E-4), their fold changes, and their proposed roles (in italics) during decay of wood treated with a copper-based preservative. Red and blue fold change values indicate genes that were more highly expressed at days 31 and 154, respectively. Target motif abbreviations: P, peroxisomal; S, secretory; M, mitochondrial; SP, signal peptide; SA, signal anchor. Target motifs (S, M, SP, and SA) were assigned by the genomic analysis (Chapter II).

Gene        Gene Annotation or Role                               Target      Fold
Identifier                                                        Motif       Change

Oxalate Biosynthesis
415         isocitrate lyase                                      P           8.2
1257        glyoxylate dehydrogenase                              P           4.9
6889        citrate synthase                                                  4.3
5088        succinate/fumarate antiporter                         SP          4.0
7902        aconitate hydratase                                               2.3
4245        2-oxoglutarate dehydrogenase e1                                   2.3

Oxalate Degradation
7157        oxalate decarboxylase                                 S/SP        7.0
6399        oxalate decarboxylase                                 S/SP        4.0

Oxidative Phosphorylation
5697        mitochondrial phosphate carrier protein               SP          9.8
8032        hemerythrin domain protein                                        5.3
5659        alternative oxidase                                               4.4
9070        cytochrome c                                                      3.3
2563        NADH-ubiquinone oxidoreductase                        M/SA        3.3
718         cytochrome c peroxidase                               M/SA        2.6

Hydroquinone (Phenolate) Reduction
4739        laccase                                               S/SP        11.2

Iron Reduction
7504        hypothetical iron reductase                           SA          2.5
750         hypothetical iron reductase                                       3.2
6327        zip-like iron transporter                                         4.4
4475        ferric reductase transmembrane component 2 precursor  SA          2.4

Quinone (Phenolate) Biosynthesis
1487        tyrosinase                                                        4.7
1213        phenylalanine ammonia lyase                                       4.0

Copper Resistance
1430        copper resistance-associated P-type ATPase                        2.5

Stress Response
1533        casein kinase α 1-like                                            4.5
2947        cmgc mitogen-activated protein kinase                             2.5
2071        mitogen-activated protein kinase                                  2.7
7796        mitogen-activated protein kinase phosphatase 3                    4.8
1147        transcription factor                                              4.5
2241        transcription factor sfp1                                         3.5
3983        transcriptional regulator prz1                                    3.5

Xenobiotic Degradation
*           18 cytochrome P450 genes (7 genes > 4x)               S/SP or SA  2.4-8.6

Degradation of Extractives
*           9 cytochrome P450 genes (3 genes > 4x)                S/SP or SA  2.8-13.5

Pectin Degradation
1513        GH28 polygalacturonase                                S/SP        4.0
2569        GH43 endo-α-L-arabinanase                             S/SP        2.7
3577        GH28 exo-rhamnogalacturonase b                        S/SP        4.0

Hemicellulose Degradation
2726        GH10 endo-β-xylanases                                 S/SP        16.6
2724        GH10 endo-β-xylanases                                 S/SP        6.1
2727        GH10 endo-β-xylanases                                 S/SP        5.8
1713        GH3 β-xylosidase                                      S/SP        3.3
7952        GH5 endo-β-mannanase                                  S/SP        4.4
7362        GH2 endo-β-mannosidase                                S/SP        4.3
187         GH2 β-mannosidase                                     S/SP        2.7
583         GH53 arabinogalactan endo-β-galactosidase             S/SP        4.3
1433        GH43 galactan-1,3-β-galactosidase                     S/SP        2.7
2755        GH115 α-glucuronidase                                 S/SP        4.1

Cellulose Degradation
509         GH5 endo-1,4-β-glucanases                             S/SP        19.7
508         GH5 endo-1,4-β-glucanases                             S/SP        9.5
4759        GH5 β-glucosidase                                     S/SP        4.7
3737        GH3 β-glucosidase                                     S/SP        5.0
1962        GH61                                                  S/SP        7.0

Sugar Transport
7926        mfs sugar transporter                                             4.2
8655        sugar transporter                                     M/SA        2.9
9158        hexose transporter                                                5.4
6103        hexose transporter                                    SA          3.6
7081        hexose transporter                                                2.9
7444        hexose carrier                                        S/SP        3.6

Hydroquinone (Phenolate) Degradation and Proton Reduction
8299        aryl alcohol oxidase                                  S/SP        5.9
7478        copper radical oxidase                                S/SP        5.5
6682        catalase                                              S/SP        2.9

Re-modeling Cell Membrane for Acid Tolerance
5564        squalene synthetase                                   S/SP        23.5
5558        squalene synthetase                                               11.1
2210        cyclopropane-fatty-acyl-phospholipid synthase                     4.5
7624        Δ-12 fatty acid desaturase                                        5.3
7689        lysophospholipase plb1                                S/SP        7.8
7325        lysophospholipase plb1                                S/SP        3.6
2492        lipase class 3                                                    4.9
6060        GDSL lipase acylhydrolase family protein                          5.3
8157        GDSL lipase acylhydrolase family protein              S/SP        4.8
5409        carbohydrate esterase family 10 protein               S/SP        7.5
8303        carbohydrate esterase family 10 protein               S/SP        4.1
7565        carbohydrate esterase family 10 protein                           4.2
4208        carbohydrate esterase family 10 protein                           3.3
8701        carbohydrate esterase family 16 choline esterase      S/SP        5.6

Re-modeling of the Glucan Sheath
1342        GH55 glucan-1,3-β-glucosidase                         S/SP        11.7
6334        GH47 α-mannosidase                                    S/SP        4.0
782         GH18 chitinase                                        S/SP        3.2
7135        GH30 glucan endo-1,6-β-glucosidase                    S/SP        2.4
3723        GH12 endo-1,3(4)-β-glucanase                          M**/SP      20.8
7277        GH16 endo-1,3(4)-β-glucanases                         S/SP        4.6
6272        GH16 endo-1,3(4)-β-glucanases                         S/SP        4.1
6020        GH79 β-glucuronidase                                  S/SP        3.8
4867        GH89 alpha-N-acetylglucosaminidase                    S/SP        3.3
4006        GH55 glucan-1,3-β-glucosidase                         S/SP        3.1
424         GH18 chitinase                                        S/SP        2.5

* Too many genes to list individually.
** A mitochondrial motif seems unlikely and could have been a mistake in the assembly at the 5' end of the sequence.

Many of the differences in GO term abundance were due to increased expression of transcripts that were related to oxalate production at the early decay time point when the fungus was adapting to copper (Figure 3.7, Table 3.1). These transcripts had putative functions for citrate synthase (4.3x), isocitrate lyase (8.2x), glyoxylate dehydrogenase (description was mitochondrial cytochrome but top blast hit was glyoxylate dehydrogenase, 4.9x), and a succinate/fumarate antiporter (signal peptide motif, 4.0x). Citrate synthase produces citrate from oxaloacetate and acetyl-CoA and occurs in both the TCA and the GLOX cycles. The TCA cycle takes place within the mitochondrial matrix, whereas the GLOX cycle occurs inside peroxisomes (Sakai et al. 2006). Isocitrate lyase, which cleaves isocitrate to glyoxylate and succinate, catalyzes the first committed step in the GLOX cycle. Glyoxylate dehydrogenase catalyzes the next step in the GLOX cycle. It oxidizes glyoxylate to oxalate while reducing NAD(P)+. The role of succinate/fumarate antiporters is to transport succinate across organellar membranes in exchange for another C4 metabolite (Munir et al. 2001; Sakai et al. 2006). Examination of the C-terminal predicted tripeptides showed that only isocitrate lyase and glyoxylate dehydrogenase genes had peroxisomal target signals at the C-terminus (AKL and SKL, respectively).
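The C-terminal tripeptide check can be sketched as a simple pattern match. The pattern used here, (S/A/C)-(K/R/H)-(L/M), is a commonly cited consensus for the type 1 peroxisomal targeting signal and is an assumption of this sketch; it is not necessarily the criterion applied in this study. The function name `has_pts1` and the example sequences are hypothetical.

```python
import re

# Assumed consensus for the C-terminal peroxisomal targeting signal (PTS1).
PTS1_PATTERN = re.compile(r"[SAC][KRH][LM]$")


def has_pts1(protein_seq):
    """Return True if the last three residues match the assumed PTS1 consensus."""
    seq = protein_seq.upper().rstrip("*")  # drop a trailing stop symbol, if any
    return bool(PTS1_PATTERN.search(seq))


# Hypothetical protein sequences ending in the tripeptides named in the text.
print(has_pts1("MSTQAKL"))   # ends in AKL
print(has_pts1("MSTQSKL*"))  # ends in SKL, with a stop symbol
print(has_pts1("MSTQGGG"))   # no match
```

Both AKL and SKL satisfy this consensus, which agrees with the peroxisomal assignment of isocitrate lyase and glyoxylate dehydrogenase.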


Genes that did not show up in the GO term analysis, but exhibited increased expression at the early time point and were related to oxalate metabolism, were aconitate hydratase (2.3x) and oxalate decarboxylase (7.0x) (Figure 3.7, Table 3.1). Aconitate hydratase catalyzes the conversion of citrate to isocitrate and occurs in both the TCA and the GLOX cycles, while oxalate decarboxylase degrades oxalate to formate and CO2. Interestingly, a different oxalate decarboxylase transcript was more highly expressed during late decay when wood showed high strength loss (4.0x). Both oxalate decarboxylase genes carried motifs for secretion and signal peptides, indicating an extracellular function. Another gene that was more highly expressed at the early time point was a probable 2-oxoglutarate dehydrogenase e1 component (2.3x). The protein product of this gene functions in the TCA cycle, converting 2-oxoglutarate to succinyl-CoA.

We detected many highly expressed early genes that had putative functions related to energy production in the mitochondria (Table 3.1). In several of the GO term categories, we observed genes for a putative mitochondrial phosphate carrier protein (motif for a signal peptide but not for secretion, 9.8x), cytochrome c (3.3x), cytochrome c peroxidase (signal anchor for mitochondria, 2.6x), NADH-ubiquinone oxidoreductase (signal anchor for mitochondria, 3.3x), a hemerythrin domain protein (5.3x), and an alternative oxidase (4.4x). Mitochondrial phosphate carrier proteins supply phosphate to the mitochondria for ATP production. Cytochrome c is an electron acceptor in the electron transport chain that is loosely bound to the inner mitochondrial membrane. Cytochrome c peroxidase reduces H2O2 using electrons from cytochrome c. NADH-ubiquinone oxidoreductase is complex I of the electron transport chain. Hemerythrin proteins carry O2, and alternative oxidase provides an alternate pathway for electron transport that reduces oxidative stress, but produces less ATP because it has fewer proton-pumping steps.

[Figure 3.7 diagram: intracellular (A) TCA cycle and (B) GLOX cycle leading to oxalate, with extracellular conversion of oxalate to formate + CO2. Fold changes shown: CS, 4.3x; AH, 2.3x; AP, 4.0x; ICL, 8.2x; ODH, 2.3x; GDH, 4.9x; ODC, 7.0x and 4.0x.]

Figure 3.7 Differential expression of gene models from (A) the TCA cycle and (B) the GLOX cycle that were regulating oxalate production (FDR < 1E-4). Red, more highly expressed early; blue, more highly expressed late. Gene abbreviations: CS, citrate synthase; AH, aconitate hydratase; AP, succinate/fumarate antiporter; ICL, isocitrate lyase; ODH, 2-oxoglutarate dehydrogenase e1 component; GDH, glyoxylate dehydrogenase; ODC, oxalate decarboxylase. Pathway is modified from Munir et al. (2001).

Transcripts in the highly expressed early GO groups that were related to wood polysaccharide breakdown had putative functions for degradation of pectins and laccase-driven Fenton chemistry (Table 3.1). The glycoside hydrolase gene families (GH followed by a number) found were a GH28 polygalacturonase (4.0x) and a GH43 endo-α-L-arabinanase (2.7x). Both genes had motifs for secretion and signal peptides for extracellular localization of the enzyme products. There was also a highly expressed laccase gene (11.2x) that carried the motifs indicative of an extracellular function. Laccases are multicopper oxidases that are implicated in driving the Fenton reaction during oxidative decay by brown rot fungi because they catalyze the one-electron abstraction of many phenolic compounds, including hydroquinones, to produce semiquinone radicals (Kerem et al. 1999; Varela and Tien 2003; Wei et al. 2010).

Fenton chemistry could also involve enzymatic reduction of iron and quinones and enzymatic production of H2O2. We did not, however, find strong evidence for any of these three processes based on the magnitude of the fold change, correlation with the time of increased laccase expression, and/or target motifs (Table 3.1). In the GO term groupings, we found one hypothetical iron reductase transcript that was highly expressed early (2.5x) and one Fe3+ reductase transmembrane component 2 precursor that was highly expressed late (2.4x). Both genes had motifs for signal anchors but not for secretion. Additional genes were mined from the list of 917 differentially expressed genes. They included a hypothetical iron reductase (3.2x) and a zip-like iron transporter (4.4x). Both genes were highly expressed late, and neither gene had localization motifs. No differentially expressed NADH quinone oxidoreductase was detected. We did, however, find many differentially expressed oxidoreductase genes, some of which had secretion motifs. Without more specific functional assignments, though, it was impossible to link them to quinone reduction. Two genes, an aryl alcohol dehydrogenase and a short chain dehydrogenase reductase, were among the most highly expressed early genes (58x and 20x, respectively). They both lacked target motifs, suggesting a cytosolic function, but their metabolic roles remain unclear. Genes that could be regulating quinone biosynthesis were also detected. In the GO term list, genes for phenylalanine ammonia lyase (4.0x) and tyrosinase (4.7x) were both highly expressed early. The former converts phenylalanine to trans-cinnamic acid and ammonia, while the latter is a monophenol monooxygenase that converts catechol to benzoquinone. The lack of motifs suggests localization to the cytosol.

The remaining notable highly expressed early genes in our GO term analysis involved signal transduction (casein kinase α 1-like, 4.5x; cmgc mitogen-activated protein kinase, 2.5x; mitogen-activated protein kinase, 2.7x; mitogen-activated protein kinase phosphatase, 4.8x), transcription induction (transcription factor, 4.5x; transcription factor sfp1, 3.5x; transcriptional regulator prz1, 3.5x), and resistance to copper (copper resistance-associated P-type ATPase, 2.5x) (Table 3.1). There were also many genes for cytochrome P450s (18 genes had secretion motifs with signal peptides or signal anchors, 7 of which also had fold changes greater than 4x) (Table 3.1). Protein kinases are often involved in transducing and amplifying the cellular response to stressful extracellular stimuli. Transcription factors are recruited to promote binding of RNA polymerase II, which initiates gene expression. Copper-resistance P-type ATPase is a copper-specific ATPase pump that transports copper across membranes and has been found to confer copper resistance in the fungus Candida albicans (Weissman et al. 2000). Cytochrome P450s belong to a large and diverse superfamily of proteins that oxidize a wide variety of organic substances including many xenobiotics. A smaller number of transcripts with cytochrome P450 annotations also appeared in the highly expressed late GO term groupings (9 genes had the necessary motifs for extracellular function, 3 of which had fold changes greater than 4x).

Many genes that were represented in the highly expressed late GO groups had putative functions related to the extracellular degradation of hemicellulose and cellulose (Table 3.1). For hemicellulose degradation, we discovered three GH10 endo-β-xylanases (16.6x, 6.1x, 5.8x), a GH3 β-xylosidase (3.3x), a GH5 endo-β-mannanase (4.4x), a GH2 endo-β-mannosidase (4.3x), a GH2 β-mannosidase (2.7x), a GH53 arabinogalactan endo-β-galactosidase (4.3x), and a GH43 galactan-1,3-β-galactosidase (2.7x). For cellulose degradation, we found two GH5 endo-1,4-β-glucanases (19.7x, 9.5x), a GH3 β-glucosidase (5.0x), and a GH5 β-glucosidase (4.7x). Another two genes were mined from the list of 917 differentially expressed genes: a GH61 gene (7.0x) and a GH115 α-glucuronidase gene (4.1x) (Table 3.1). GH61 is a family of non-canonical glycoside hydrolases that exhibit weak endo-1,4-β-D-glucanase activity, but when mixed with endoglucanases, enhance the breakdown of lignocellulose. GH115 has two members, xylan α-1,2-glucuronidase and α-(4-O-methyl)-glucuronidase, both of which function in hemicellulose breakdown. Only one transcript was highly expressed late and involved in pectin degradation. It was a GH28 exo-rhamnogalacturonase b (4.0x). All the glycoside hydrolase genes mentioned had motifs for secretion and signal peptides, suggesting extracellular function for their gene products. In total, we identified 16 transcripts from 9 different glycoside hydrolase families that were highly expressed late (Table 3.1).
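The totals in the last sentence can be checked by tallying the late-expressed glycoside hydrolases named in this paragraph. The sketch below simply transcribes the family and fold change of each transcript from the text; the variable names are hypothetical.

```python
from collections import Counter

# (family, fold change) for each highly expressed late transcript listed above.
late_glycoside_hydrolases = [
    ("GH10", 16.6), ("GH10", 6.1), ("GH10", 5.8),  # endo-beta-xylanases
    ("GH3", 3.3),                                  # beta-xylosidase
    ("GH5", 4.4),                                  # endo-beta-mannanase
    ("GH2", 4.3), ("GH2", 2.7),                    # mannosidases
    ("GH53", 4.3), ("GH43", 2.7),                  # galactosidases
    ("GH5", 19.7), ("GH5", 9.5),                   # endo-1,4-beta-glucanases
    ("GH3", 5.0), ("GH5", 4.7),                    # beta-glucosidases
    ("GH61", 7.0), ("GH115", 4.1),                 # mined from the list of 917 genes
    ("GH28", 4.0),                                 # exo-rhamnogalacturonase b
]

per_family = Counter(family for family, _ in late_glycoside_hydrolases)
n_transcripts = len(late_glycoside_hydrolases)
n_families = len(per_family)
print(n_transcripts, n_families)
print(per_family.most_common(3))
```

The tally gives 16 transcripts in 9 families, with GH5 contributing the most members.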

High rates of extracellular polysaccharide breakdown imply a concurrent increase in sugar transport. In the list of 917 differentially expressed genes, we detected three hexose transport genes that showed increased expression during the late stage of decay when wood showed high strength loss, one with a signal anchor (3.6x) and two without (5.4x, 2.9x), as well as one hexose carrier that had a signal peptide (3.6x). Only the hexose carrier also exhibited a motif for secretion. By comparison, two sugar transporters were more highly expressed at the early time point during copper adaptation. One was a sugar transporter (signal anchor for mitochondria, 2.9x) and the other was a major facilitator superfamily sugar transporter that had no target motifs (4.2x). All six transport genes are listed in Table 3.1.

Other highly expressed late genes in the GO term groups involved extracellular H2O2 production by an aryl alcohol oxidase (5.9x) and degradation by a catalase (2.9x). A second gene for H2O2 production, a copper radical oxidase (5.5x), was found in the list of 917 differentially expressed genes. All three genes had motifs for secretion and signal peptides, indicative of extracellular function (Table 3.1).

Genes for lipid metabolism were highly expressed late (Table 3.1) and were found in several GO term groups. Some of the transcripts identified for lipid biosynthesis were two highly expressed squalene synthetases, one with motifs for an extracellular function (23.5x) and one without (11.1x). Increased expression of a cyclopropane-fatty-acyl-phospholipid synthase gene (4.5x) and a Δ-12 fatty acid desaturase gene (5.3x) was also observed. Both probably functioned in the cytosol as their genes lacked target motifs. Squalene synthetases synthesize squalene, which is the first committed step for sterol biosynthesis. Cyclopropane-fatty-acyl-phospholipid synthase introduces a cyclopropane ring into acyl chains of phospholipids. Δ-12 fatty acid desaturases are oxidoreductases that remove two hydrogen atoms to insert a double bond at the C12 position of a fatty acid.

Highly expressed late genes related to lipid catabolism from the GO groups included two lysophospholipase plb1 genes (both with secretion motifs; 7.8x, 3.6x), a lipase class 3 gene (no motifs, 4.9x), two GDSL lipase acylhydrolase family proteins, one with secretion motifs (4.8x) and one without (5.3x), four carbohydrate esterase family 10 proteins, two with secretion motifs (7.5x, 4.1x) and two without (4.2x, 3.3x), and a hypothetical carbohydrate esterase family 16 choline esterase, also with motifs for secretion (5.6x). These 10 lipid catabolism genes all have roles in membrane glycerophospholipid catabolism.

Our last set of results pertains to the glycoside hydrolase genes that had functions related to re-modeling the glucan sheath of the fungal cell wall to accommodate the different extracellular conditions and activities between the early and late time points (Table 3.1). All genes had motifs for secretion and signal peptides unless indicated otherwise. The four highly expressed early transcripts were: a GH55 glucan-1,3-β-glucosidase (11.7x), a GH47 α-mannosidase (4.0x), a GH18 chitinase (3.2x), and a GH30 glucan endo-1,6-β-glucosidase (2.4x). The seven highly expressed late transcripts were: a GH12 endo-1,3(4)-β-glucanase (signal peptide for mitochondria, 20.8x), two GH16 endo-1,3(4)-β-glucanases (4.6x and 4.1x), a GH79 β-glucuronidase (3.8x), a GH89 alpha-N-acetylglucosaminidase (3.3x), a GH55 glucan-1,3-β-glucosidase (3.1x), and a GH18 chitinase (2.5x). All genes were found in the differential GO term list except the GH79 gene, which was found in the list of 917 differentially expressed genes. The mitochondrial motif of the GH12 gene seems unlikely and could have been a mistake in the assembly at the 5' end of the sequence.

Discussion

Our gene expression analysis identified many key genes that appeared to regulate the response to toxic levels of copper and organic biocide, and the utilization of wood as a growth substrate. Not only did we find highly significant, large fold increases of many genes (> 4x, FDR < 1E-4), but we also found genes that were highly expressed together and in the appropriate cellular compartment, whether in the same pathway or mediating a related function, which served as corroborating evidence that a particular biological response was actively being controlled by the fungus. We attribute these pronounced results to our experimental design, which forced the fungus into two physiological states that were at different extremes in terms of stress and stage of decay.

In the early decay phenotype, we were exploiting the natural copper tolerance exhibited by brown rot fungi to investigate how the fungus regulates increased oxalate production. We observed concurrent increases in genes that controlled hydroquinone production, laccase production, and oxidative phosphorylation. This suggested strong metabolic links between oxalate production, laccase-driven Fenton chemistry, and energy production. Furthermore, the presence of an organic co-biocide in the wood treatment triggered a stress response that included many cytochrome P450 genes that mediated detoxification of the DDAC. In the late phenotype, the wood exhibited high strength loss, which led to the discovery of a panoply of genes that were directing the degradation of the wood polysaccharides and hexose transport.

Other biological activities that appeared to be important when wood exhibited high strength loss were arresting Fenton chemistry, raising the extracellular pH to 4.0, and remodeling the outer layers of the cell. Arresting Fenton chemistry involved the combined action of H2O2-producing oxidases to oxidize laccase substrates and remove excess protons, oxalate decarboxylase to raise pH, and catalase to remove the H2O2. The increased expression of many cytochrome P450s would then ensure the complete breakdown of the oxidized laccase substrates. We also inferred that the increases in squalene/sterol biosynthesis, cyclopropane fatty acid synthesis, and membrane glycerophospholipid catabolism were necessary to protect the plasma membrane against the acid conditions it would be exposed to during the steady efflux of glycoside hydrolases and influx of hexoses. The observed differential regulation of glycoside hydrolases involved in re-modeling of the glucan sheath also supports dynamic changes in the cell wall in response to different environmental stresses and stages of decay.

Although the importance of oxalate in wood decay and copper tolerance has been long recognized, we are the first group to identify the specific genes that regulate increased oxalate production. Based on the magnitude of the fold change (8x), our data identified a peroxisomal isocitrate lyase gene as a critical rate-limiting step for oxalate biosynthesis. This seems reasonable since isocitrate lyase is the first committed step of oxalate biosynthesis, producing glyoxylate, the only metabolite in the GLOX cycle that is not also in the TCA cycle. Previous studies in P. placenta demonstrated functional TCA and GLOX cycles, but significant differential fold changes were not clearly demonstrated (Martinez et al. 2009; Vanden Wymelenberg et al. 2010; Vanden Wymelenberg et al. 2011). We attribute this difference to the presence of the high copper concentrations in our preservative treatment of the wood.

Other highly expressed transcripts related to increased oxalate biosynthesis were citrate synthase (4x), aconitate hydratase (2x), 2-oxoglutarate dehydrogenase (2x), a succinate/fumarate antiporter (4x), and glyoxylate dehydrogenase (5x). These genes also served as control points for increased oxalate biosynthesis, but the fold change differences were not as high as for isocitrate lyase. Since peroxisomal target signals were only detected for the isocitrate lyase and glyoxylate dehydrogenase genes, our results indicate that increased oxalate biosynthesis stemmed from increased production of citrate and isocitrate in the TCA cycle of the mitochondrial matrix and then glyoxylate and oxalate production in the peroxisome. This would explain why we observed concurrent increased expression of a succinate/fumarate antiporter. We believe that the antiporter was trading isocitrate produced in the mitochondria for succinate produced in the peroxisome. Sakai et al. (2006) came to a similar conclusion when they observed isocitrate lyase activity in the peroxisome of F. palustris. P. placenta (Martinez et al. 2009) and Saccharomyces cerevisiae (Lee et al. 2000), however, exhibited peroxisomal target sequence motifs for citrate synthase, suggesting that basal rates of oxalate biosynthesis may occur primarily within the GLOX cycle.
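The fold changes quoted above can be compared more easily on a log2 scale, which is the scale on which differential expression tools such as edgeR typically report results. The following is a minimal sketch that only re-expresses the fold changes stated in the text; the dictionary name and the ranking step are illustrative assumptions, not part of the original analysis.

```python
import math

# Fold changes as quoted in the text (early time point, up-regulated).
# The variable name and layout are illustrative, not from the study.
oxalate_fold_changes = {
    "isocitrate lyase": 8.0,
    "glyoxylate dehydrogenase": 5.0,
    "citrate synthase": 4.0,
    "succinate/fumarate antiporter": 4.0,
    "aconitate hydratase": 2.0,
    "2-oxoglutarate dehydrogenase": 2.0,
}

def log2_fold_change(fold_change: float) -> float:
    """Convert a linear fold change (e.g. 8x) to a log2 fold change."""
    return math.log2(fold_change)

# Rank genes from the largest to the smallest log2 fold change.
ranked = sorted(
    oxalate_fold_changes.items(),
    key=lambda item: log2_fold_change(item[1]),
    reverse=True,
)

for gene, fc in ranked:
    print(f"{gene}: {fc:g}x = log2FC {log2_fold_change(fc):.2f}")
```

On this scale an 8x change corresponds to three doublings, which makes the gap between isocitrate lyase and the 2x genes (one doubling) easier to see.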

Enzymatic studies in F. palustris suggest that the terminal steps of oxalate production involved the conversion of glyoxylate to malate within the peroxisome, then transport of malate to the cytoplasm, where malate was oxidized to oxaloacetate, and oxaloacetate cleaved to oxalate and acetate (Munir et al. 2001). These conclusions were based on greater activity levels of oxaloacetate hydrolase than glyoxylate dehydrogenase. Our gene expression results, instead, suggest that when oxalate production is up-regulated, a much shorter route of oxalate production takes effect. Oxalate is synthesized directly from glyoxylate by glyoxylate dehydrogenase within the peroxisome, bypassing the intermediates, malate and oxaloacetate. Furthermore, since we did not detect differential expression of mono- and dicarboxylate transporters, our data suggest that transport of oxalate to the exterior was not a rate-limiting step.

A consequence of increased oxalate production is increased energy production (Figure 3.7). For each succinate molecule that enters the mitochondria from cleavage of isocitrate by isocitrate lyase, one FADH2 and one NADH are produced by the TCA cycle for ATP production. This metabolic coupling between the GLOX and the TCA cycle, which was discovered by Munir et al. (2001), would explain why we observed increased expression of six genes controlling oxidative phosphorylation and O2 transport at the same time that we saw increased expression of genes controlling oxalate production. The strong ties between increased gene expression for oxalate production and energy production suggest that brown rot fungi up-regulate oxalate levels to meet increased energy needs. Based on our results, response to stress would qualify as a period of high energy demand.
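The energy bookkeeping described above (one FADH2 and one NADH per succinate) can be turned into a rough ATP estimate. The sketch below is illustrative only: the ATP-per-cofactor ratios are commonly used textbook approximations and are assumptions here, not values measured or reported in this study, and the function name is hypothetical.

```python
# Assumed textbook approximations for oxidative phosphorylation yield.
# These ratios are NOT from this study.
ATP_PER_NADH = 2.5
ATP_PER_FADH2 = 1.5

def atp_from_succinate(n_succinate: float) -> float:
    """Estimate ATP equivalents from oxidizing succinate to oxaloacetate.

    Succinate -> fumarate yields one FADH2, and malate -> oxaloacetate
    yields one NADH; fumarate -> malate yields no reduced cofactor.
    """
    fadh2 = n_succinate * 1  # one FADH2 per succinate
    nadh = n_succinate * 1   # one NADH per succinate
    return fadh2 * ATP_PER_FADH2 + nadh * ATP_PER_NADH

print(atp_from_succinate(1))
```

Under these assumed ratios, every isocitrate cleaved by isocitrate lyase would contribute roughly four ATP equivalents through the succinate it releases, which is consistent with the proposed link between oxalate production and energy production.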

In addition to up-regulating gene expression to increase flux through the oxalate biosynthesis pathway, our results also suggest that F. radiculosa was actively controlling the rate of extracellular oxalate degradation. The magnitude of the fold change for an extracellular oxalate decarboxylase gene (7x), which degrades oxalate, was almost as high as the fold change for isocitrate lyase (8x). Moreover, increased expression of a different oxalate decarboxylase at the late time point (4x) without increased oxalate production indicated that the role of oxalate degradation depended upon the amount of oxalate being produced, the bioavailability of copper, and the stage of decay of the treated wood. It appears that copper removal by oxalate precipitation was controlled by high rates of oxalate production and degradation, while much lower rates of oxalate production and moderate rates of oxalate degradation were associated with arresting Fenton chemistry when wood exhibited high strength loss. Clearly, the brown rot fungus was regulating both oxalate production and oxalate degradation to control Fenton chemistry.

As we indicated earlier, high oxalate production and high energy output were probably a physiological response to the stress caused by the wood preservative. This stress response was possibly mediated by the four kinases and three transcription-related factors that exhibited increased expression concurrently with the oxalate and energy production genes. Another indication that the wood preservative was the trigger for the stress response was the increase in expression of a transcript for a copper resistance-associated P-type ATPase pump. Although we saw only a 2.5x increase of the copper resistance gene, this gene determines whether an organism will survive or die in a high copper environment (Weissman et al. 2000). Copper is a micronutrient that is actively taken up by cells, so organisms that survive in a high copper environment must have a mechanism to actively extrude copper from the cytosol. Up until this point, copper tolerance in brown rot fungi had only been attributed to increased oxalate levels (Clausen and Green 2003; Green and Clausen 2003) and copper oxalate precipitation (Jarosz-Wilkolazka and Gadd 2003; Fomina et al. 2005). Our results indicate that survival is also probably mediated by up-regulation of a copper-resistance ATPase pump.

Another component of the stress response was the increase in expression of 26 cytochrome P450 genes (18 with secretion motifs, 13 > 4x). Many of these genes were probably involved in the detoxification of the DDAC, which at the early time point was still protecting the wood from decay along with the copper. Once the wood exhibited 52% strength loss, we observed a different set of 16 cytochrome P450 genes that were highly expressed (9 with secretion motifs, 7 > 4x). These genes were most likely causing enzymatic degradation of small organic molecules like the oxidized laccase substrates or extractives such as lipids and polyphenols that would become more accessible once the lignocellulose matrix became more porous. About 1-5% of the dry weight in softwood is in the extractive fraction, which is composed primarily of resin acids, fatty acids, monoterpenes, and phenolics. The dramatic contrast between the two time points in the number and fold changes of cytochrome P450 genes indicates the dynamic range of flexibility that brown rot fungi have for detoxifying xenobiotics and degrading organic compounds. In P. placenta, Vanden Wymelenberg et al. (2011) also observed increased expression of 15 cytochrome P450 genes in liquid cultures of ball-milled pine compared to aspen. Since softwood has more extractives than hardwood, they concluded that there were many cytochrome P450 genes that could be regulating degradation of the softwood extractives.

The demand for energy during a xenobiotic-induced stress response has also been observed in the white rot fungus, Phanerochaete chrysosporium (Shimizu et al. 2005). The effect of exogenous vanillin on protein levels showed increased activity of citrate synthase (2.6x), aconitate hydratase (5.1x), ATP synthase (2.1x), and 2 enzymes for heme biosynthesis. The authors also observed a drastic increase in enzyme activity of isocitrate dehydrogenase (920x) compared to isocitrate lyase (3.7x), although neither of these latter two enzymes was quantified in their 2D gels. Isocitrate dehydrogenase occurs in the TCA cycle and catalyzes oxidative decarboxylation of isocitrate to 2-oxoglutarate and CO2. P. chrysosporium, however, like most white rot fungi, is not copper-tolerant and cannot produce high levels of oxalate. Therefore, it appears that white rot fungi have evolved a different mechanism to meet the stress-induced energy demand, namely up-regulation of isocitrate dehydrogenase in the TCA cycle. This is in contrast to our data, which suggest that brown rot fungi rely on up-regulation of isocitrate lyase in the GLOX cycle.

Another notable discovery was that we observed concurrent increases in expression of an extracellular acting laccase gene (11x) with gene expression for high oxalate production. This is the first report demonstrating co-regulation of a laccase gene with genes controlling oxalate production. The significance of this connection is that it provides in vivo data supporting laccase-driven hydroxyl free radical production in the presence of high oxalate. It is generally accepted that brown rot fungi have evolved the ability to use Fenton chemistry to produce hydroxyl free radicals for decay initiation (Koenigs 1974; Kirk et al. 1991; Jensen et al. 2001; Arantes et al. 2009a). Laccase plays a central role in this proposed mechanism because it is an enzyme that can form semiquinone (or phenolate) radicals, which have high enough reducing potential to reduce iron in the presence of high oxalate (Wei et al. 2010). A complete Fenton system in the presence of high oxalate can produce hydroxyl free radicals because the semiquinone radicals participate in several chemical reactions that not only include reduction of iron in the iron oxalate complex but also include reduction of O2 to H2O2 (through dismutation) and reduction of quinone back to the hydroquinone (Jensen et al. 2002; Wei et al. 2010). Thus, in the presence of high oxalate, a complete Fenton system does not require any other enzyme than laccase (Wei et al. 2010). This may explain why, at the early time point, we found increased expression of genes for quinone and phenolate production (phenylalanine ammonia lyase 4x and tyrosinase 5x), but no concurrent increased expression of genes for extracellular iron reductase, quinone reductase, or genes for extracellular H2O2 production. The random action of free radicals would also seem to limit extensive extracellular enzymatic activities.
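The reaction network described in this paragraph can be summarized schematically as follows. This is a simplified sketch rather than a balanced mechanism from the cited studies: SQ and Q are shorthand labels introduced here for the semiquinone radical and the quinone, protonation states and the oxalate ligands on iron are omitted, and only the final line is the classical Fenton reaction.

```latex
\begin{align}
\mathrm{SQ^{\bullet}} + \mathrm{Fe^{3+}} &\rightarrow \mathrm{Q} + \mathrm{Fe^{2+}}
  && \text{(iron reduction by semiquinone)} \\
\mathrm{SQ^{\bullet}} + \mathrm{O_2} &\rightarrow \mathrm{Q} + \mathrm{O_2^{\bullet -}}
  && \text{(one-electron reduction of } \mathrm{O_2}\text{)} \\
2\,\mathrm{O_2^{\bullet -}} + 2\,\mathrm{H^+} &\rightarrow \mathrm{H_2O_2} + \mathrm{O_2}
  && \text{(dismutation)} \\
\mathrm{Fe^{2+}} + \mathrm{H_2O_2} &\rightarrow \mathrm{Fe^{3+}} + \mathrm{OH^-} + {}^{\bullet}\mathrm{OH}
  && \text{(Fenton reaction)}
\end{align}
```

Read this way, the semiquinone supplies both Fenton reactants (reduced iron and H2O2), which is why no separate iron reductase or H2O2-producing oxidase would be needed in this scheme.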

Our data also suggest that the laccase-driven Fenton system for hydroxyl free radical decay of wood was occurring at the same time as copper oxalate precipitation. Speculating further, we propose that the reason genes for laccase production and oxalate metabolism are co-regulated is that the conditions that are conducive to metal oxalate precipitation also promote oxidative decay by laccase-driven Fenton systems. This may explain why white rot fungi that express laccase in the absence of high oxalate have not been able to exploit Fenton chemistry to the extent that brown rot fungi have. Thus, we propose that the evolution of brown rot fungi from white rot fungi (Hibbett and Donoghue 2001) not only involved loss of many gene families that are typical of white rot decay, like cellobiose dehydrogenase, lignin and manganese peroxidases, etc. (Martinez et al. 2009), but also hinged on their ability to over-produce oxalate at the same time as laccase. Only then were brown rot fungi able to exploit the highly efficient laccase-driven Fenton systems to initiate wood decay. The link between oxalate and energy production was probably another conferred advantage that undoubtedly contributed to the selection of these three traits.

Our analysis of the putative genes involved in wood polysaccharide degradation was also highly informative. We observed increased expression of genes for two extracellular pectin degrading enzymes, a GH28 polygalacturonase (4x) and a GH43 endo-α-L-arabinanase (2.7x) at the early time point. Pectins are soluble wood polysaccharides. They include galacturonans, rhamnogalacturonans, arabinans, and galactans (Sjostrom 1993) and comprise < 1% of the dry weight of softwood (Anderson 1946). Pectin rich regions include the torus of pit membranes, the middle lamella, the cell corners, and the parenchymal cells (Schwarze 2007). Based on studies showing that brown rot fungi colonize the tracheid cells of wood by using pectinases to grow through the pit membranes (Daniel 1994; Green and Clausen 1999), we surmise that expression of these two genes was increased to regulate degradation of the pectic substances in the torus region of the pit membranes. Pit apertures also connect tracheid cells with parenchymal cells, but parenchymal cells appear to resist degradation by brown rot fungi (Schwarze 2007).

Evidently, expression of the polygalacturonase and the endo-arabinanase along with the genes that support a laccase-driven Fenton system was not sufficient to cause strength loss to the wood by day 31. This was in sharp contrast to the late time point, day 154, during which the wood exhibited 52% strength loss and the fungus was characterized by the simultaneous increased expression of 16 glycoside hydrolases from 9 different families. Based on the magnitude of the increase in expression levels (most from 4x to 20x), the kinds of bonds hydrolyzed, the sugars involved, and the loss of wood strength, we conclude that this assemblage of genes was regulating the complete hydrolysis of the structural polysaccharides and the remaining pectic substances in the pine wafer. Roughly 40% of softwood is cellulose and 30% is hemicellulose (Sjostrom 1993). Three kinds of hemicellulose (in order of decreasing abundance) are present in softwood: galactoglucomannan, arabino-4-O-methyl-α-glucuronoxylan, and arabinogalactan (Sjostrom 1993). The multitude of genes involved and the kinds of glycoside hydrolases represented showed overall similarity to the glycoside hydrolases found in gene expression studies of P. placenta (Martinez et al. 2009; Vanden Wymelenberg et al. 2010; Vanden Wymelenberg et al. 2011). The major differences were that fold changes in P. placenta were typically < 2x, that fewer wood-related glycoside hydrolases were differentially expressed (P < 0.05 or 0.01), and that we attempted to separate glycoside hydrolase genes that might function in fungal cell wall re-modeling from those involved in wood decay (Martinez et al. 2009; Vanden Wymelenberg et al. 2010; Vanden Wymelenberg et al. 2011).

When we compared the differentially expressed GH families we found with those that were detected by LC-MS/MS in culture filtrates of P. placenta grown in ball-milled pine, we found six families that were shared by both species: GH2, GH3, GH5, GH10, GH28, and GH43. The GH61 and GH115 genes were observed only in F. radiculosa. Although we have not yet done expression or enzyme assays to characterize the in vivo function of the Fibroporia genes, their functional annotations indicate that they were likely regulating enzyme activities like those described for other brown rot fungi for the degradation of cellulose and hemicellulose (Cohen et al. 2005; Valaskova and Baldrian 2006). Activities for xylanase, endoglucanase, and processive endoglucanase were found (Cohen et al. 2005; Valaskova and Baldrian 2006), as well as activities for endo-1,4-β-mannanase, 1,4-β-glucosidase, 1,4-β-xylosidase, and 1,4-β-mannosidase (Valaskova and Baldrian 2006). We also observed increased expression of a GH28 exo-rhamnogalacturonase b gene (4x). We believe that this exo-rhamnogalacturonase was regulating degradation of the pectic substances from the middle lamella and the cell corners. Structurally, these regions are furthest from the tracheid lumen and would only become accessible once the lignocellulosic matrix became more porous.

Another unique aspect of our study was the identification of three hexose transport genes and one hexose carrier gene that appeared to be up-regulated at the same time as the set of 15 glycoside hydrolases. One hexose transport gene had a signal anchor (4x) suggesting a membrane bound location, one hexose carrier had a signal peptide (4x) indicative of an extracellular function, and the other two transport genes lacked a motif (5x, 3x), most likely functioning in the cytosol. This suggests that large amounts of glucose and/or galactose, the two major hexoses of wood, were being released, translocated across the plasma membrane, and carted to their various destinations within the cell. This contrasted with only two generic sugar transporters that showed increased expression at the early time point, neither of which had signal peptides (4x, 3x).

Our results also suggest that transcripts with putative roles in extracellular H2O2 production and catabolism were up-regulated when wood showed high strength loss. We found increased expression of genes for an aryl alcohol oxidase (6x) and a copper radical oxidase (6x) that were probably controlling H2O2 production, along with increased expression of a catalase transcript (3x) for H2O2 degradation. The most intriguing aspect was that expression of these three genes was not concurrent with the increase in laccase, indicating that extracellular H2O2 metabolism was not related to laccase-driven hydroxyl free radical production. For years, though, the assumption has been that the H2O2 produced enzymatically by brown rot fungi, whether by FAD-binding glucose-methanol-choline oxidases like methanol oxidase (Daniel et al. 2007), aryl alcohol oxidase (Fernandez et al. 2009), and pyranose oxidase (Daniel et al. 1994), or by copper radical oxidases (CRO) that resemble glyoxal oxidase of white rot fungi (Kersten 1990), was a reactant for Fenton chemistry.

Our results suggest a very different role: the two oxidases were arresting Fenton chemistry by oxidizing the laccase substrates while reducing O2 to H2O2. The catalase would then ensure prompt removal of the H2O2 by converting it to the more benign molecules, H2O and O2. An added benefit of these reactions is that they reduce the number of protons. After high oxalate secretion, the pH would probably be quite low, approaching 2.0. The catalytic optimum for most endoglucanases, on the other hand, is closer to pH 4.0 or 5.0 (Baldrian and Valásková 2008). Thus, by co-regulating gene expression of the aryl alcohol and copper radical oxidases with catalase, we propose that the brown rot fungi have adapted H2O2 forming oxidases that were used by ancestral white rot fungi for lignin degradation (in combination with manganese peroxidase or lignin peroxidase), to a new role of removing laccase substrates and raising the pH, as the brown rot fungus switched from Fenton-mediated decay to enzymatic hydrolysis of the wood polysaccharides. The observed 4x increase in expression of an oxalate decarboxylase gene at the late time point is another important mechanism that contributes to the change in pH, since oxalate decarboxylase converts oxalate to formate and the pKa of formic acid is 3.8.
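The pH argument can be illustrated with the Henderson-Hasselbalch relation. The sketch below uses only the formic acid pKa of 3.8 quoted in the text; the function name and the choice of example pH values are illustrative assumptions, and the calculation ignores ionic strength and the other acids present in decayed wood.

```python
FORMIC_ACID_PKA = 3.8  # value quoted in the text

def fraction_protonated(ph: float, pka: float) -> float:
    """Fraction of a monoprotic acid in its protonated form.

    Derived from Henderson-Hasselbalch:
    pH = pKa + log10([A-]/[HA])  =>  [HA]/([HA] + [A-]) = 1 / (1 + 10**(pH - pKa))
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

# Example pH values: near the oxalate-acidified extreme (2.0), at the
# formic acid pKa (3.8), and near the endoglucanase optimum (5.0).
for ph in (2.0, 3.8, 5.0):
    frac = fraction_protonated(ph, FORMIC_ACID_PKA)
    print(f"pH {ph}: {frac:.1%} of formic acid protonated")
```

A weak acid buffers most strongly near its pKa, so replacing oxalate with formate would tend to hold the pH closer to 3.8, nearer the pH 4.0 to 5.0 range cited for endoglucanases than the pH of about 2.0 expected after heavy oxalate secretion.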

The last set of genes that appeared to be up-regulated at the late time point had putative functions for membrane lipid metabolism. There were at least 10 genes involved in phospholipid catabolism that showed increased expression at the same time as genes involved in biosynthesis of specific membrane lipid components. In particular, we observed increased expression of two squalene synthetase genes (24x and 11x), which supply squalene and/or ergosterol for incorporation into membranes. High amounts of either have been shown to reduce membrane permeability to the passive diffusion of protons (Haines 2001). This function is also consistent with the observed increase in expression of a gene for cyclopropane fatty acyl-phospholipid synthase (5x). In Escherichia coli, membrane cyclopropane fatty acids were found to enhance survival when cells were grown in low pH environments (Brown et al. 1997) or exposed to acid shock (Chang and Cronan 1999). The role of Δ-12 fatty acid desaturase is less clear because an increase in unsaturated fatty acids in membranes is typically associated with cold tolerance (Suutari and Laakso 1994). Co-expression of a Δ-12 fatty acid desaturase gene with genes for acid tolerance, however, does suggest importance of linoleic acid as a cell membrane phospholipid component during the late stages of decay. Our hypothesis is that in the switch from a decay strategy based on laccase-driven Fenton chemistry to one that involves enzymatic hydrolysis of the wood polysaccharides, the fungal cell wall became exposed to acid conditions. To protect itself, the fungus altered the chemical composition of the plasma membrane to reduce the harmful effects of low pH.

The scenario we propose is that during laccase-driven Fenton decay, the fungal cell wall, which is mainly composed of chitin, glucans, mannans, and glycoproteins (Bowman and Free 2006), surrounds the hyphae during metal oxalate precipitation (Connolly et al. 1996), and is substantial enough to prevent the plasma membrane from being exposed to the extremely low pH conditions caused by oxalate secretion. During this period of early decay, simple repairs were made to the glucan sheath, which explains why we only observed increased expression of the appropriate glucosidases, a mannosidase, and a chitinase. The switch to enzymatic hydrolysis of the wood polysaccharides, however, would require that the fungal cell wall become more porous to accommodate the huge efflux of glycoside hydrolases and the huge influx of hexoses. Based on the increased expression of a GH12 endo-1,3(4)-β-glucanase (21x), two GH16 endo-1,3(4)-β-glucanase genes (4x, 5x), a GH79 β-glucuronidase (4x), a GH89 α-N-acetylglucosaminidase, a GH55 glucan-1,3-β-glucosidase (3x), and a GH18 chitinase (2.5x), this appeared to involve a major breakdown of the glycoproteins and polysaccharides in the cell wall. Lost integrity of the sheath, however, would expose the plasma membrane to low pH, which explained the observed increases in membrane lipid metabolism. The types of changes in the lipid composition served as a protective measure to reduce acid shock and proton diffusion across the plasma membrane.

In conclusion, based on our differential gene expression study, we discovered many new insights into the evolution of brown rot fungi and the complex network of metabolic processes that regulate wood decay. These biological processes were tightly coordinated to make nutrients quickly available while ensuring the survival of the fungus during the harsh and changing conditions of extracellular digestion. Despite the tremendous advantage that systems biology approaches have for accelerating our knowledge, the number of differentially expressed genes with limited annotations emphasizes just how much more fundamental research needs to be done. Our hope, though, is that as we understand more about the genes that regulate wood decay and copper tolerance, we will be able to rationally explore brown rot biochemistry for novel industrial processes like biomass to biofuel conversion and identify specific molecular targets for wood preservative development.


References Cited

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., Lipman, D.J. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25: 3389-402.

Anderson, E. 1946. The isolation of pectic substances from softwoods. J. Biol. Chem. 165: 233-40.

Arantes, V., Milagres, A.M., Filley, T.R., Goodell, B. 2011. Lignocellulosic polysaccharides and lignin degradation by wood decay fungi: the relevance of nonenzymatic Fenton-based reactions. J. Ind. Microbiol. Biotechnol. 38: 541-55.

Arantes, V., Qian, Y., Kelley, S.S., Milagres, A.M., Filley, T.R., Jellison, J., Goodell, B. 2009a. Biomimetic oxidative treatment of spruce wood studied by pyrolysis-molecular beam mass spectrometry coupled with multivariate analysis and 13C-labeled tetramethylammonium hydroxide thermochemolysis: implications for fungal degradation of wood. J. Biol. Inorg. Chem. 14: 1253-63.

Arantes, V., Qian, Y.H., Milagres, A.M.F., Jellison, J., Goodell, B. 2009b. Effect of pH and oxalic acid on the reduction of Fe3+ by a biomimetic chelator and on Fe3+ desorption/adsorption onto wood: implications for brown-rot decay. Int. Biodeterior. Biodegrad. 63: 478-83.

AWPA 2009. E22-09 Standard accelerated laboratory method for testing the efficacy of preservatives against wood decay fungi using compression strength. In: American Wood Protection Association Book of Standards. American Wood Protection Association, Birmingham, AL, pp. 425-31.

Baldrian, P., Valásková, V. 2008. Degradation of cellulose by basidiomycetous fungi. FEMS Microbiol. Rev. 32: 501-21.

Benjamini, Y., Hochberg, Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Roy. Stat. Soc. B 57: 289-300.

Bowman, S.M., Free, S.J. 2006. The structure and synthesis of the fungal cell wall. BioEssays 28: 799-808.

Brown, J.L., Ross, T., McMeekin, T.A., Nichols, P.D. 1997. Acid habituation of Escherichia coli and the potential role of cyclopropane fatty acids in low pH tolerance. Int. J. Food Microbiol. 37: 163-73.

Cantarel, B.L., Coutinho, P.M., Rancurel, C., Bernard, T., Lombard, V., Henrissat, B. 2009. The Carbohydrate-Active EnZymes database (CAZy): an expert resource for glycogenomics. Nucleic Acids Res. 37: D233-8.

Chang, Y.Y., Cronan, J.E., Jr. 1999. Membrane cyclopropane fatty acid content is a major factor in acid resistance of Escherichia coli. Mol. Microbiol. 33: 249-59.

Clausen, C.A., Green, F. 2003. Oxalic acid overproduction by copper-tolerant brown-rot basidiomycetes on southern yellow pine treated with copper-based preservatives. Int. Biodeterior. Biodegrad. 51: 139-44.

Clausen, C.A., Green, F., Woodward, B.M., Evans, J.W., DeGroot, R.C. 2000. Correlation between oxalic acid production and copper tolerance in Wolfiporia cocos. Int. Biodeterior. Biodegrad. 46: 69-76.

Cohen, R., Suzuki, M.R., Hammel, K.E. 2005. Processive endoglucanase active in crystalline cellulose hydrolysis by the brown rot basidiomycete Gloeophyllum trabeum. Appl. Environ. Microbiol. 71: 2412-7.

Connolly, J.H., Arnott, H.J., Jellison, J. 1996. Patterns of calcium oxalate crystal production by three species of wood decay fungi. Scanning Microscopy 10: 385-400.

Cowling, E.B. 1961. Comparative biochemistry of the decay of sweetgum sapwood by white-rot and brown-rot fungi. Madison, WI, Technical Bulletin No. 1258. U.S. Department of Agriculture, Forest Service, Forest Products Laboratory.

Curling, S.F., Clausen, C.A., Winandy, J.E. 2002. Relationships between mechanical properties, weight loss, and chemical composition of wood during incipient brown-rot decay. For. Prod. J. 52: 34-39.

Daniel, G. 1994. Use of electron microscopy for aiding our understanding of wood biodegradation. FEMS Microbiol. Rev. 13: 199-233.

Daniel, G. 2003. Microview of wood under degradation by bacteria and fungi. In: Wood Deterioration and Preservation: Advances in Our Changing World, eds. Goodell, B., Nicholas, D.D., Schultz, T.P. American Chemical Society, Washington, pp. 34-71.

Daniel, G., Volc, J., Filonova, L., Plihal, O., Kubatova, E., Halada, P. 2007. Characteristics of Gloeophyllum trabeum alcohol oxidase, an extracellular source of H2O2 in brown rot decay of wood. Appl. Environ. Microbiol. 73: 6241-43.

Daniel, G., Volc, J., Kubatova, E. 1994. Pyranose oxidase, a major source of H2O2 during wood degradation by Phanerochaete chrysosporium, Trametes versicolor, and Oudemansiella mucida. Appl. Environ. Microbiol. 60: 2524-32.

Dutton, M.V., Evans, S.C., Atkey, P.T., Wood, D.A. 1993. Oxalate production by basidiomycetes, including the white-rot species Coriolus versicolor and Phanerochaete chrysosporium. Appl. Environ. Microbiol. 39: 5-10.


Fernandez, I.S., Ruiz-Duenas, F.J., Santillana, E., Ferreira, P., Martinez, M.J., Martinez, A.T., Romero, A. 2009. Novel structural features in the GMC family of oxidoreductases revealed by the crystal structure of fungal aryl-alcohol oxidase. Acta Crystallogr. D 65: 1196-205.

Flournoy, D.S., Kent Kirk, T., Highley, T.L. 1991. Wood decay by brown-rot fungi: changes in pore structure and cell wall volume. Holzforschung 45: 383-88.

Fomina, M., Hillier, S., Charnock, J.M., Melville, K., Alexander, I.J., Gadd, G.M. 2005. Role of oxalic acid overexcretion in transformations of toxic metal minerals by Beauveria caledonica. Appl. Environ. Microbiol. 71: 371-81.

Gianfreda, L., Xu, F., Bollag, J.M. 1999. Laccases: a useful group of oxidoreductive enzymes. Bioremediation J. 3: 1-25.

Goodell, B., Jellison, J., Liu, J., Daniel, G., Paszczynski, A., Fekete, F., Krishnamurthy, S., Jun, L., Xu, G. 1997. Low molecular weight chelators and phenolic compounds isolated from wood decay fungi and their role in the fungal biodegradation of wood. J. Biotechnol. 53: 133-62.

Green, F., Clausen, C.A. 1999. Production of polygalacturonase and increase of longitudinal gas permeability in southern pine by brown rot and white rot fungi. Holzforschung 53: 563-68.

Green, F., Clausen, C.A. 2003. Copper tolerance of brown-rot fungi: time course of oxalic acid production. Int. Biodeterior. Biodegrad. 51: 145-49.

Green, F., Larsen, M.J., Winandy, J., Highley, T.L. 1991. Role of oxalic acid in incipient brown-rot decay. Mater. Organismen 26: 191-213.

Haines, T.H. 2001. Do sterols reduce proton and sodium leaks through lipid bilayers? Prog. Lipid Res. 40: 299-324.

Hammel, K.E., Kapich, A.N., Jensen, K.A., Ryan, Z.C. 2002. Reactive oxygen species as agents of wood decay by fungi. Enzyme Microb. Technol. 30: 445-53.

Hibbett, D.S., Donoghue, M.J. 2001. Analysis of character correlations among wood decay mechanisms, mating systems, and substrate ranges in homobasidiomycetes. Syst. Biol. 50: 215-42.

Highley, T.L. 1987. Effect of carbohydrate and nitrogen on hydrogen peroxide formation by wood decay fungi in solid medium. FEMS Microbiol. Lett. 48: 373-77.

Humar, M., Petric, M., Pohleven, F. 2001. Changes of the pH value of impregnated wood during exposure to wood-rotting fungi. Holz Als Roh-Und Werkstoff 59: 288-93.


Irbe, I., Andersons, B., Chirkova, J., Kallavus, U., Andersone, I., Faix, O. 2006. On the changes of pinewood (Pinus sylvestris L.) chemical composition and ultrastructure during the attack by brown-rot fungi Postia placenta and Coniophora puteana. Int. Biodeterior. Biodegrad. 57: 99-106.

Jarosz-Wilkolazka, A., Gadd, G.M. 2003. Oxalate production by wood-rotting fungi growing in toxic metal-amended medium. Chemosphere 52: 541-7.

Jensen, K.A., Jr., Houtman, C.J., Ryan, Z.C., Hammel, K.E. 2001. Pathways for extracellular Fenton chemistry in the brown rot basidiomycete Gloeophyllum trabeum. Appl. Environ. Microbiol. 67: 2705-11.

Jensen, K.A., Ryan, Z.C., Wymelenberg, A.V., Cullen, D., Hammel, K.E. 2002. An NADH:quinone oxidoreductase active during biodegradation by the brown-rot basidiomycete Gloeophyllum trabeum. Appl. Environ. Microbiol. 68: 2699-703.

Kerem, Z., Jensen, K.A., Hammel, K.E. 1999. Biodegradative mechanism of the brown rot basidiomycete Gloeophyllum trabeum: evidence for an extracellular hydroquinone-driven Fenton reaction. FEBS Lett. 446: 49-54.

Kersten, P.J. 1990. Glyoxal oxidase of Phanerochaete chrysosporium: its characterization and activation by lignin peroxidase. Proc. Natl. Acad. Sci. USA 87: 2936-40.

Kim, Y.S., Wi, S.G., Lee, K.H., Singh, A.P. 2002. Cytochemical localization of hydrogen peroxide production during wood decay by brown-rot fungi Tyromyces palustris and Coniophora puteana. Holzforschung 56: 7-12.

Kirk, T.K., Ibach, R., Mozuch, M.D., Conner, A.H., Highley, T.L. 1991. Characterization of cotton cellulose depolymerized by a brown-rot fungus, by acid, or by chemical oxidants. Holzforschung 45: 239-44.

Koenigs, J.W. 1974. Hydrogen peroxide and iron: a proposed system for decomposition of wood by brown-rot basidiomycetes. Wood Fiber Sci. 6: 66-80.

Langmead, B., Trapnell, C., Pop, M., Salzberg, S.L. 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10: R25.

Lee, J.G., Cho, S.P., Lee, H.S., Lee, C.H., Bae, K.S., Maeng, P.J. 2000. Identification of a cryptic N-terminal signal in Saccharomyces cerevisiae peroxisomal citrate synthase that functions in both peroxisomal and mitochondrial targeting. J. Biochem. 128: 1059-72.


Martinez, D., Challacombe, J., Morgenstern, I., Hibbett, D., Schmoll, M., Kubicek, C.P., Ferreira, P., Ruiz-Duenas, F.J., Martinez, A.T., Kersten, P., Hammel, K.E., Vanden Wymelenberg, A., Gaskell, J., Lindquist, E., Sabat, G., Bondurant, S.S., Larrondo, L.F., Canessa, P., Vicuna, R., Yadav, J., Doddapaneni, H., Subramanian, V., Pisabarro, A.G., Lavin, J.L., Oguiza, J.A., Master, E., Henrissat, B., Coutinho, P.M., Harris, P., Magnuson, J.K., Baker, S.E., Bruno, K., Kenealy, W., Hoegger, P.J., Kues, U., Ramaiya, P., Lucas, S., Salamov, A., Shapiro, H., Tu, H., Chee, C.L., Misra, M., Xie, G., Teter, S., Yaver, D., James, T., Mokrejs, M., Pospisek, M., Grigoriev, I.V., Brettin, T., Rokhsar, D., Berka, R., Cullen, D. 2009. Genome, transcriptome, and secretome analysis of wood decay fungus Postia placenta supports unique mechanisms of lignocellulose conversion. Proc. Natl. Acad. Sci. USA 106: 1954-59.

Mortazavi, A., Williams, B.A., McCue, K., Schaeffer, L., Wold, B. 2008. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods 5: 621-8.

Munir, E., Yoon, J.J., Tokimatsu, T., Hattori, T., Shimada, M. 2001. A physiological role for oxalic acid biosynthesis in the wood-rotting basidiomycete Fomitopsis palustris. Proc. Natl. Acad. Sci. USA 98: 11126-30.

Qiagen 2010. RNeasy Mini Handbook. Valencia, CA, Qiagen.

Ritschkoff, A.C., Ratto, M., Buchert, J., Viikari, L. 1995. Effect of carbon source on the production of oxalic acid and hydrogen peroxide by brown-rot fungus Poria placenta. J. Biotechnol. 40: 179-86.

Robinson, M.D., McCarthy, D., Chen, Y., Smyth, G.K. 2010. edgeR: differential expression analysis of digital gene expression data. Retrieved Jan. 2, 2011 from http://www.bioconductor.org/.

Robinson, M.D., Oshlack, A. 2010. A scaling normalization method for differential expression analysis of RNA-Seq data. Genome Biol. 11: R25.

Rowell, R.M. 2005. Handbook of Wood Chemistry and Wood Composites. Boca Raton, FL, CRC Press.

Sakai, S., Nishide, T., Munir, E., Baba, K., Inui, H., Nakano, Y., Hattori, T., Shimada, M. 2006. Subcellular localization of glyoxylate cycle key enzymes involved in oxalate biosynthesis of wood-destroying basidiomycete Fomitopsis palustris grown on glucose. Microbiology 152: 1857-66.

Schilling, J.S., Jellison, J. 2004. High-performance liquid chromatography analysis of soluble and total oxalate in Ca- and Mg-amended liquid cultures of three wood decay fungi. Holzforschung 58: 682-87.

Schilling, J.S., Jellison, J. 2005. Oxalate regulation by two brown rot fungi decaying oxalate-amended and non-amended wood. Holzforschung 59: 681-88.

Schilling, J.S., Jellison, J. 2006. Metal accumulation without enhanced oxalate secretion in wood degraded by brown rot fungi. Appl. Environ. Microbiol. 72: 5662-65.

Schwartz, R.L., Phoenix, T. 2001. Learning Perl. Cambridge, O'Reilly.

Schwarze, F.W.M.R. 2007. Wood decay under the microscope. Fungal Biol. Rev. 21: 133-70.

Shimizu, M., Yuda, N., Nakamura, T., Tanaka, H., Wariishi, H. 2005. Metabolic regulation at the tricarboxylic acid and glyoxylate cycles of the lignin-degrading basidiomycete Phanerochaete chrysosporium against exogenous addition of vanillin. Proteomics 5: 3919-31.

Sjostrom, E. 1993. Wood Chemistry Fundamentals and Applications. New York, Academic Press.

Stajich, J.E., Block, D., Boulez, K., Brenner, S.E., Chervitz, S.A., Dagdigian, C., Fuellen, G., Gilbert, J.G., Korf, I., Lapp, H., Lehvaslaiho, H., Matsalla, C., Mungall, C.J., Osborne, B.I., Pocock, M.R., Schattner, P., Senger, M., Stein, L.D., Stupka, E., Wilkinson, M.D., Birney, E. 2002. The BioPerl toolkit: Perl modules for the life sciences. Genome Res. 12: 1611-18.

Suutari, M., Laakso, S. 1994. Microbial fatty acids and thermal adaptation. Crit. Rev. Microbiol. 20: 285-328.

Suzuki, M.R., Hunt, C.G., Houtman, C.J., Dalebroux, Z.D., Hammel, K.E. 2006. Fungal hydroquinones contribute to brown rot of wood. Environ. Microbiol. 8: 2214-23.

Takao, S. 1965. Organic acid production by basidiomycetes. Appl. Microbiol. 13: 732-37.

Tang, J.D., Sonstegard, T., Burgess, S.C., Diehl, S.V. 2010. A genomic sequencing approach to study wood decay and copper tolerance in the brown rot fungus, Antrodia radiculosa. Int. Res. Group Wood Prot. Proc. IRG/WP 10-10720.

Valaskova, V., Baldrian, P. 2006. Degradation of cellulose and hemicelluloses by the brown rot fungus Piptoporus betulinus - production of extracellular enzymes and characterization of the major cellulases. Microbiology 152: 3613-22.

Vanden Wymelenberg, A., Gaskell, J., Mozuch, M., Sabat, G., Ralph, J., Skyba, O., Mansfield, S.D., Blanchette, R.A., Martinez, D., Grigoriev, I., Kersten, P.J., Cullen, D. 2010. Comparative transcriptome and secretome analysis of wood decay fungi Postia placenta and Phanerochaete chrysosporium. Appl. Environ. Microbiol. 76: 3599-610.

Vanden Wymelenberg, A., Gaskell, J., Mozuch, M., Splinter BonDurant, S., Sabat, G., Ralph, J., Skyba, O., Mansfield, S.D., Blanchette, R.A., Grigoriev, I., Kersten, P., Cullen, D. 2011. Significant alteration of gene expression in wood decay fungi Postia placenta and Phanerochaete chrysosporium by plant species. Appl. Environ. Microbiol. 77: 4499-507.

Varela, E., Tien, M. 2003. Effect of pH and oxalate on hydroquinone-derived hydroxyl radical formation during brown rot wood degradation. Appl. Environ. Microbiol. 69: 6025-31.

Wang, W., Huang, F., Mei Lu, X., Ji Gao, P. 2006. Lignin degradation by a novel peptide, Gt factor, from brown rot fungus Gloeophyllum trabeum. Biotechnol. J. 1: 447-53.

Wang, Z., Gerstein, M., Snyder, M. 2009. RNA-Seq: a revolutionary tool for transcriptomics. Nat. Rev. Genet. 10: 57-63.

Wei, D.S., Houtman, C.J., Kapich, A.N., Hunt, C.G., Cullen, D., Hammel, K.E. 2010. Laccase and its role in production of extracellular reactive oxygen species during wood decay by the brown rot basidiomycete Postia placenta. Appl. Environ. Microbiol. 76: 2091-97.

Weissman, Z., Berdicevsky, I., Cavari, B.Z., Kornitzer, D. 2000. The high copper tolerance of Candida albicans is mediated by a P-type ATPase. Proc. Natl. Acad. Sci. USA 97: 3520-5.

CHAPTER IV

CONCLUSIONS

The results of this study illustrate two points: (1) systems biology approaches offer tremendous opportunities for identifying the genes that regulate the growth of living organisms, and (2) gene sequencing technology has reached the point of being affordable and workable for a typical academic research laboratory. A few historical milestones accentuate just how far the field of genomics has advanced. Consider that man landed on the moon in 1969. The sequence of the human genome, on the other hand, was not completed until 2003. The project took 13 years to complete, cost about $200 M, and required the concerted effort of 20 sequencing centers around the world, cranking out data 24-7 (Collins et al. 2003). Now, in 2011, a single student has been able to demonstrate that a fungal genome can be sequenced in less than 2 years, at a cost of $6000, through a collaboration forged between four research groups, three at Mississippi State (the Department of Forest Products, the Institute for Genomics, Biocomputing and Biotechnology, and the Department of Computer Science and Engineering) and one at USDA ARS (the Bovine Functional Genomics Laboratory). Granted, the human genome is about 100x larger than the Fibroporia radiculosa genome, contains about 2.5x more genes, and has other structural complexities that are absent from the genomes of fungi. Nevertheless, these facts illustrate how the economic, labor, and temporal scale of the science has changed in the past decade.

This dissertation also demonstrates that with an additional 1.5 years and $7000, the genes that control biological processes like wood decay and preservative tolerance can be identified. Thus, one can quickly gain an understanding of how genes work together to regulate the complex network of metabolic pathways that define how cells grow and respond to their environment. When this dissertation was begun, very little was known about the genetic mechanisms that control brown rot decay and copper tolerance. Upon completion of this dissertation, however, the sequences of 9262 genes had been described for the brown rot fungus F. radiculosa. Moreover, by monitoring gene expression, 108 genes were identified as playing key roles in the cellular processes that regulate decay of preservative-treated wood.

These genes represent new discoveries that help us understand, for example, that copper tolerance in brown rot fungi is controlled by two separate biological processes: (1) increased expression of isocitrate lyase for the production of oxalate that transforms copper into an inert precipitate, and (2) a copper-resistance ATPase pump that prevents intracellular levels of copper from becoming toxic. Metabolism, though, is a network of interconnected pathways, a point that the transcriptomic study made crystal clear. The gene expression results tied increased oxalate production to increased energy production and laccase-driven hydroxyl free radical production. It became apparent that energy was a critical component for battling the stress induced by the preservative-treated wood and that laccase was driving the chemical scission of cellulose by Fenton chemistry. These are just a few examples of the linked discoveries that were made when an organism was studied as an entire system as opposed to its dissected parts.

This comprehensive knowledge, in turn, has significant impacts that are not limited to the field of wood protection and wood preservative development. Comparative genomics can reveal how a species and its lifestyle evolved. This is because DNA is not only a molecular clock, but also determines the biological traits that control speciation. Functional genomics can also reveal novel genes that control unusual biochemical reactions or biological processes. This dissertation found functional annotations for only 58% of the genes that make up the F. radiculosa genome, and discussed only 108 of the 917 differentially expressed genes. Obviously, there is much more to the brown rot genome than we currently understand, and much that has yet to be explored for industrial exploitation.

Free radical production for cellulose scission is unique to the brown rot fungi and gives them quick access to the rich source of sugars that make up wood. Only time will tell if the genes that control free radical production could also be manipulated for the sustainable production of biofuel from lignocellulosic biomass. And why not? The unique biochemical diversity of fungi has already been tapped for the production of drugs (antibiotics, immune suppressants, ergot alkaloids, and statins), food and beverages, bleaching and delignification of woody pulp, and bioremediation of toxic waste. Clearly, fungal genomes will continue to be a tremendous genetic resource that will help us improve our existence on earth.

References Cited

Collins, F.S., Morgan, M., Patrinos, A. 2003. The human genome project: lessons from large-scale biology. Science 300: 286-90.
