Genomic and Gene Expression Studies of

Coprinopsis cinerea by

5' Serial Analysis of Gene Expression (SAGE)

CHENG,Chi Keung

A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Philosophy in Molecular Biotechnology

The Chinese University of Hong Kong, April 2008

The Chinese University of Hong Kong holds the copyright of this thesis. Any person(s) intending to use part or whole of the materials in the thesis in a proposed publication must seek copyright release from the Dean of the Graduate School.

Thesis/Assessment Committee

Professor FUNG Ming Chiu (Chair)
Professor KWAN Hoi Shan (Thesis Supervisor)
Professor TSUI Kwok Wing (Committee Member)

Abstract

Coprinopsis cinerea, the inky cap, is a model organism for studying developmental processes in basidiomycetous fungi. Its short life cycle, ease of cultivation and fruiting in the laboratory, and the availability of a draft genome assembly have greatly facilitated genetic and molecular studies.

Our long-term goal is to understand the molecular events involved in mushroom fruiting body initiation and development, which are still unclear. Current knowledge of the C. cinerea genome includes no genome-wide data on expression patterns or transcription start sites (TSS). Therefore, two 5' Serial Analysis of Gene Expression (SAGE) libraries, employing a ditag strategy and the sequencing capacity of the GS20 genome sequencer, were constructed from the dikaryotic mycelial and primordial stages of C. cinerea to generate ~250,000 tag sequences for comprehensive transcriptome and TSS analyses, thereby providing possible insights into the fruiting mechanisms and contributing to a better annotation of the genome and a better understanding of transcription regulation.

A total of ~38,000 and ~51,000 unique tags were isolated from mycelium and primordium, respectively. Approximately 80% of these tags were mapped to a single position on the C. cinerea genome, of which ~35-37% were mapped to the putative 5'-untranslated region (UTR) of annotated genes, representing an estimate of ~10,000 and ~15,000 genuine TSS for the mycelial and primordial stages, respectively. The TSS data are crucial for defining transcriptional units and for analyses of promoters and upstream regulatory elements. Alternative TSSs, potential antisense transcripts and unannotated novel genes were identified, and possible errors in current genome annotations were also revealed.
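The tallies reported above (tags mapped to a single genome position, and the subset falling in a putative 5'-UTR) can be illustrated with a minimal sketch. It assumes that every tag has already been aligned to the genome. The data layout (`tag_hits`, `utr_regions`) and the function name `summarize_mapping` are hypothetical and are not taken from the thesis pipeline.

```python
from collections import Counter


def summarize_mapping(tag_hits, utr_regions):
    """Tally tags by mapping outcome and collect candidate TSS positions.

    tag_hits    : dict mapping tag sequence -> list of (chrom, pos, strand) hits
    utr_regions : dict mapping chrom -> list of (start, end, strand, gene_id),
                  the putative 5'-UTR window of each annotated gene
    Both structures are illustrative assumptions.
    """
    counts = Counter()
    tss_by_gene = {}
    for tag, hits in tag_hits.items():
        if len(hits) != 1:
            # Tags with no hit or several hits cannot be assigned to one locus.
            counts["unmapped" if not hits else "multi_mapped"] += 1
            continue
        counts["unique"] += 1
        chrom, pos, strand = hits[0]
        for start, end, utr_strand, gene in utr_regions.get(chrom, []):
            # A sense-strand tag inside the window is a candidate TSS.
            if strand == utr_strand and start <= pos <= end:
                counts["unique_in_utr"] += 1
                tss_by_gene.setdefault(gene, []).append(pos)
                break
    return counts, tss_by_gene


# Toy input: five tags, one annotated 5'-UTR window on "chr1".
example_hits = {
    "AAA": [("chr1", 105, "+")],
    "CCC": [("chr1", 105, "+"), ("chr2", 7, "-")],
    "GGG": [],
    "TTT": [("chr1", 500, "+")],
    "ACG": [("chr1", 120, "-")],
}
example_utrs = {"chr1": [(100, 150, "+", "geneA")]}
example_counts, example_tss = summarize_mapping(example_hits, example_utrs)
```

In this sketch, several distinct positions collected for one gene would correspond to candidate alternative TSSs, and uniquely mapped tags on the opposite strand are left outside the 5'-UTR tally.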

By considering the tags mapped to the putative 5'-UTR, the expression level of ~6,700 genes was evaluated, of which ~1,000 differentially expressed genes were identified by the Fisher Exact Test. This suggests a significant switching of transcriptomes during the transition from mycelium to primordium. Investigation into these genes revealed an up-regulation of the whole protein synthesis machinery and signal transduction system. Potential players in light signaling, sensing of nutrient depletion and membrane structure alteration, which have been suggested to induce fruiting body initiation and development, were also recognized. This set of differential expression data can be compared to the expression profiles studied in other fungi. The 5' SAGE data serve as a valuable platform for understanding gene functions and various biological processes, including the molecular events underlying the fruiting process in mushrooms.
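As an illustration of how a Fisher Exact Test can flag a differentially expressed gene from tag counts in two libraries, the following is a minimal sketch. The 2x2 table layout (tags for one gene versus all other tags, in each library) is an assumption, since the abstract does not specify it, and the function names and example counts are hypothetical.

```python
from math import exp, lgamma


def _log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical):
      a = tags for the gene in library 1, b = all other tags in library 1
      c = tags for the gene in library 2, d = all other tags in library 2
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def log_p(x):
        # Hypergeometric log-probability of seeing x gene tags in library 1.
        return _log_comb(row1, x) + _log_comb(row2, col1 - x) - _log_comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    log_p_obs = log_p(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    p = sum(exp(log_p(x)) for x in range(lo, hi + 1)
            if log_p(x) <= log_p_obs + 1e-7)
    return min(1.0, p)


# Hypothetical gene: 40 tags in one library and 90 in the other,
# each library holding 125,000 tags in total.
p_value = fisher_exact_two_sided(40, 124_960, 90, 124_910)
```

Working in log space with `lgamma` avoids overflow when library sizes reach hundreds of thousands of tags. With thousands of genes tested, a multiple-testing correction would normally be applied to the resulting p-values; whether and how this was done in the thesis is not stated in the abstract.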

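Chinese Abstract

Coprinopsis cinerea, also known as the inky cap mushroom, is a model organism widely used to study developmental processes in basidiomycetous mushrooms. Its short life cycle, its ease of cultivation and fruiting in the laboratory, and the availability of a draft genome sequence have facilitated genetic and molecular biology studies.

Our long-term goal is to study, at the molecular level, the formation and development of the fruiting body, which remain unclear. At present, the C. cinerea genome lacks gene expression data and genome-wide data on transcription start sites (TSS). Therefore, using a ditag approach and the large sequencing capacity of the GS20 genome sequencer, I constructed two 5' Serial Analysis of Gene Expression (SAGE) libraries and obtained ~250,000 tag sequences from the dikaryotic mycelium and the primordium of C. cinerea. These sequences can be used to analyse gene expression profiles and transcription start sites, thereby providing insight into the fruiting mechanism and contributing to a more complete annotation of the genome and a better understanding of transcriptional regulation.

From the mycelium and the primordium, we obtained ~38,000 and ~51,000 unique tags, respectively. About 80% of the tags were mapped to a single position in the C. cinerea genome, of which ~35-37% were mapped to the putative 5' untranslated region of annotated genes, meaning that about 10,000 and 15,000 genuine transcription start sites were identified in the respective developmental stages. These TSS data are important for defining transcriptional units and for analysing promoters and upstream regulatory elements. In addition, this study found alternative TSSs, potential antisense transcripts and unannotated novel genes, and also revealed possible errors in the current genome annotation.

Considering the tags mapped to the 5' untranslated region, I evaluated the expression levels of ~6,700 genes and identified ~1,000 differentially expressed genes using the Fisher Exact Test. This shows that a different gene expression profile unfolds during development from mycelium to primordium. Examining these genes, I found up-regulation of the whole protein synthesis machinery and the signal transduction system. Molecules thought to trigger fruiting body formation and development were also identified, such as those possibly involved in light signalling, sensing of nutrient depletion and alteration of membrane structure. This set of differential expression data can also be compared with the gene expression profiles of other fungi. This study therefore provides a valuable platform for understanding gene functions and growth processes in various mushrooms, including the development of the whole fruiting body.

Acknowledgements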

I would like to express my sincere gratitude to my supervisor Professor

Hoi-shan Kwan for his inspiration, guidance and patience throughout my research. His valuable comments have enlightened me greatly.

I would like to thank my thesis committee, Prof. Ming-chiu Fung, Prof.

Kwok-wing Tsui and Prof. Wing-kin Yip, for their precious time and comments on my thesis.

Special thanks are due to Mr. Tommy Au for his endless support and

comments on the bioinformatics of the thesis. I would like to thank Miss Winnie

Chum for her support and SAGE data. I would also like to sincerely thank all other

377 labmates, Astley Chu, Anna Yip, Carol Szeto, Liam Lee, Iris Kwok, Chris Sham,

Crystal Lee and Jackie Wong, for their discussions, suggestions and friendships. I am

grateful to Mr. Chi-chiu Li for his technical support.

Finally, I would like to thank my family and my dearest friends for their love, and everyone who has supported me during these years.

Abbreviations

Abbreviations used in this thesis without definition include:

bp Base pair(s)

cAMP Cyclic adenosine monophosphate

cDNA Complementary DNA

DEPC Diethylpyrocarbonate

DIG Digoxigenin

DNA Deoxyribonucleic acid

EDTA Ethylene diamine tetraacetic acid

kb kilobase (s)

LB Luria broth

MOPS 3-[N-Morpholino] propanesulfonate

mRNA Messenger RNA

PCR Polymerase Chain Reaction

RNA Ribonucleic acid

RNase Ribonuclease

rRNA Ribosomal RNA

SDS Sodium dodecyl sulfate

TBE Tris-boric acid-EDTA

Tris Tris(hydroxymethyl) aminomethane

Table of Contents

English Abstract
Chinese Abstract
Acknowledgements
Abbreviations
Table of Contents
List of Figures
List of Tables

Chapter 1 Literature Review

1.1 Introduction and Taxonomy
1.2 Life cycle and morphology
1.3 Growth requirements
1.3.1 Nutritional requirements
1.3.2 Environmental factors
1.4 Fruiting body development in Coprinopsis cinerea
1.4.1 Physiology of the fruiting process
1.4.2 Other studies related to the fruiting process
1.5 Other biological studies in Coprinopsis cinerea
1.5.1 Meiosis studies
1.5.2 Mating analyses
1.5.3 Peroxidase production
1.5.4 Transformation and gene silencing
1.5.5 Other studies
1.6 C. cinerea genome project
1.7 Transcriptome analyses
1.7.1 Serial Analysis of Gene Expression (SAGE)
1.7.2 Analyzing the 5' end of transcripts
1.7.3 Mapping of SAGE tags to the genome
1.8 High throughput sequencing
1.8.1 Pyrophosphate sequencing
1.8.2 Application of pyrosequencing
1.9 Aims of project

Chapter 2 5' Serial Analysis of Gene Expression (5' SAGE) from mycelial and primordial stages of C. cinerea

2.1 Introduction
2.2 Materials and Methods
2.2.1 5' SAGE libraries construction
2.2.1.1 Mushroom mycelium and primordium cultivation
2.2.1.2 RNA extraction
2.2.1.3 Isolation of mRNA
2.2.1.4 cDNA synthesis
2.2.1.5 MmeI digestion and polyacrylamide gel electrophoresis
2.2.1.6 Formation and amplification of ditag
2.2.2 Identification of 100bp ditag
2.2.3 High throughput pyrosequencing
2.2.4 Tags extraction from ditags
2.2.5 Genome mapping and annotation
2.3 Results
2.3.1 5' SAGE libraries construction
2.3.1.1 cDNA synthesis
2.3.1.2 MmeI digestion and ditag formation
2.3.2 Identification of 100bp ditags
2.3.3 High throughput pyrosequencing
2.3.4 Tags extraction from ditags
2.3.5 Genome mapping and annotation
2.4 Discussion
2.4.1 5' SAGE libraries construction
2.4.2 Tags extraction and genome mapping
2.4.3 Observations based on the genome mapping data

Chapter 3 Validation of expression patterns of 5' SAGE libraries and analysis of differentially expressed genes

3.1 Introduction
3.2 Materials and Methods
3.2.1 Identification of housekeeping gene by Northern Blot analysis
3.2.1.1 RNA fractionation by formaldehyde gel electrophoresis
3.2.1.2 Transfer of RNAs
3.2.1.3 Probe preparation
3.2.1.4 Hybridization, stringency washes and signal detection
3.2.2 Quantitative real-time PCR
3.2.2.1 cDNA synthesis from 2 developmental stages
3.2.2.2 Primer design and verification
3.2.2.3 Real-time PCR reaction and data analysis
3.2.3 Gene expression level comparison
3.3 Results
3.3.1 Identification of housekeeping gene by Northern Blot analysis
3.3.2 Quantitative real-time PCR analysis
3.3.3 Gene expression level comparison
3.4 Discussion
3.4.1 Validation of 5' SAGE libraries
3.4.2 Analysis of highly and differentially expressed genes

Chapter 4 General discussion

References

Appendix

List of Figures

Figure 1.1 Life cycle of Coprinopsis cinerea grown on artificial YMG medium.
Figure 1.2 Fruiting body development of Coprinopsis cinerea in a 12-h-light/dark regime (Kües, 2000).
Figure 2.1 A schematic diagram illustrating the major procedures in 5' SAGE.
Figure 2.2 An overview of the C. cinerea genome annotations website, using galactose binding lectin (cgl2) as an example to illustrate various features available for analysis.
Figure 2.3 Agarose gel electrophoresis of double-stranded cDNA synthesized from mRNA isolated from mycelium and stage 1 primordium of C. cinerea.
Figure 2.4 Polyacrylamide gel electrophoresis of the 50bp fragments after MmeI digestion of mycelial and primordial cDNA.
Figure 2.5 Agarose gel electrophoresis of 100bp ditags after low-cycle PCR amplification.
Figure 2.6 Sequence information of one of the cloned 100bp ditags from the mycelial stage.
Figure 2.7 Read length distribution of all sequence reads for mycelial and primordial ditags.
Figure 2.8 Distribution of tags mapped to the putative 5'-UTR upstream to the start codon.
Figure 2.9 Alternative transcription start sites in the gene encoding proteasome subunit beta type 6 in the primordial stage.
Figure 2.10 Tags in anti-sense orientation found in the coding region of the gene encoding histone H4 in both mycelial and primordial stages.
Figure 2.11 Possible errors in predicting the exon-intron boundary for the gene encoding 60S ribosomal protein P2.
Figure 3.1 Northern blotting of the transcripts of Cc.G6PDH at six different developmental stages of C. cinerea.
Figure 3.2 Northern blotting of the transcripts of Cc.Pma at six different developmental stages of C. cinerea.
Figure 3.3 Northern blotting of the transcripts of Cc.Ras at six different developmental stages of C. cinerea.
Figure 3.4 Northern blotting of the transcripts encoded by the clathrin coat assembly protein gene at mycelial and primordial stages of C. cinerea.
Figure 3.5 Verification of 12 sets of real-time PCR primers on 1.5% agarose gel electrophoresis.
Figure 3.6 Melting curve analysis for 12 real-time PCR products.
Figure 3.7 The relative expression level ratio of (a) Thioredoxin, (b) tetraspanin, (c) Vip1 protein, (d) subtilisin N-terminal region, from real-time PCR analysis.
Figure 3.8 The relative expression level ratio of (a) ATP synthase oligomycin sensitivity conferral protein, (b) 60S ribosomal protein L34, (c) Ubiquitin fusion protein, from real-time PCR analysis.
Figure 3.9 The relative expression level ratio of (a) 40S ribosomal protein S14, (b) Prohibitin PHB1, (c) ATP synthase delta chain, (d) Basic leucine zipper and W2 domain 2, from real-time PCR analysis.
Figure 3.10 Gene Ontology (GO) annotations (http://www.geneontology.org/GO.downloads.shtml) for 139 differentially expressed genes in the mycelial stage visualized by the freeware Gene Ontology Browsing Utility (GOBU) (http://gobu.iis.sinica.edu.tw).
Figure 3.11 Gene Ontology (GO) annotations (http://www.geneontology.org/GO.downloads.shtml) for 484 differentially expressed genes in the primordial stage visualized by the freeware Gene Ontology Browsing Utility (GOBU) (http://gobu.iis.sinica.edu.tw).

List of Tables

Table 1.1 General tag-based approaches for transcriptome analyses developed from the SAGE technology.

Table 2.1 Summary of raw data obtained from GS20 pyrosequencing.

Table 2.2 Summary of the tag extraction and genome mapping.

Table 2.3 Gene-associated positions of the uniquely matched tags to the C. cinerea genome.

Table 3.1 Primers for amplification of the potential housekeeping gene.

Table 3.2 Primers for quantitative real-time PCR analysis.

Table 3.3 Threshold cycle detected from real-time PCR for the clathrin coat assembly protein gene.

Table 3.4 Summary of the top 150 most highly expressed genes in the mycelial stage.

Table 3.5 Summary of the top 150 most highly expressed genes in the primordial stage.

Table 3.6 Summary of the 207 (out of 358) differentially expressed genes with protein homologues in the mycelial stage.

Table 3.7 Summary of the 561 (out of 696) differentially expressed genes with protein homologues in the primordial stage.

Table A Summary of the 151 (out of 358) differentially expressed genes without protein homologues in the mycelial stage.

Table B Summary of the 135 (out of 696) differentially expressed genes without protein homologues in the primordial stage.

Chapter 1 Literature Review

1.1 Introduction and Taxonomy

Coprinopsis cinerea (Schaeff. ex Fr.) Gray, commonly known as the inky cap, is one of the model organisms widely used to study developmental processes in homobasidiomycetous fungi. It has a haploid genome size of 37.5 Mb distributed over 13 chromosomes. Members of the genus Coprinopsis are defined as saprophytic mushrooms whose gills, and often the entire cap, autodigest at maturity, releasing an inky black fluid to the ground (Arora, 1986). C. cinerea is classified in the family Coprinaceae, order Agaricales, subphylum Homobasidiomycetes and phylum Basidiomycota.

C. cinerea is found worldwide growing on heaps of horse dung, rotten straw and vegetable refuse. It also grows and fruits well on artificial medium (Madelin,

1956). Although C. cinerea has only limited edible value, it is used to understand the

developmental process in many edible basidiomycetes, most of which cannot

produce fruiting bodies in the laboratory and are not readily accessible to genetic

approaches (Chang et al., 1993). Its relatively short life cycle has also facilitated

genetic studies.

In recent years, C. cinerea has been widely tested for peroxidase (CiP)

production for treatment of waste water contaminated with phenolic compounds

(Mao et al., 2006; Ikehata et al., 2004). Mycelial broth cultures (Han et al., 1999)

and lectins (Wang et al., 1998) purified from certain Coprinopsis species were also

found to exhibit anti-tumor activity.

1.2 Life cycle and morphology

C. cinerea is a heterothallic basidiomycete. Its life cycle can be divided into six main developmental stages: basidiospore, mycelium, stage 1 primordium, stage 2 primordium, young fruiting body and mature fruiting body (Kües, 2000) (Figure 1.1, Figure 1.2).

The life cycle starts when the haploid binucleate basidiospores released from

fruiting bodies of dikaryons germinate to form a monokaryotic mycelium. There are

two main types of mycelium: the infertile monokaryon and the fertile dikaryon

(Casselton, 1995). Only the fertile dikaryons can produce fruiting bodies, and while

the infertile monokaryons cannot, they can still grow to produce mitotic aerial

spores (oidia) and mitotic submerged spores (chlamydospores). In general, the

mycelium of dikaryons tends to grow faster, and has a denser, more protuberant and

conspicuous aerial portion than their monokaryotic counterparts (Buller, 1931).

Dikaryons are formed upon fusion of monokaryons of compatible mating types,

either by hypha-hypha fusion or by fusion of hyphae with germinating or resting

oidia (Casselton and Economou, 1985). During fusion, nuclei enter the mycelium

of the opposite mating type and migrate through the hyphae until they reach the

hyphal tip cell. Specialized clamp cells (or hook cells) are formed during

synchronous division of the paired nuclei. When nutrients become depleted and

favorable environmental conditions are available, some areas of the vegetative

hyphae aggregate to form small communities known as the hyphal knots, which

signify the initiation of the fruiting process (Kües et al., 1998).

Following the appearance of the hyphal knots, various environmental factors,

light in particular, are crucial for the process to proceed. The hyphal knots grow into

the globose fruiting body initials and then quickly pass through stage 1 and stage 2

primordium, immature and eventually mature fruiting bodies under appropriate

light/dark cycles (Ballou and Holton, 1985). If conditions are favorable, it takes only

a few days to reach maturation. The mature fruiting body can be divided into the stipe and cap structures: the stipe is a hollow cylinder, whereas the spore-bearing gills are found underneath the cap. When the fruiting body matures, the cap structure further expands and is subsequently autolyzed, accompanied by the release of a brown to black ink carrying the basidiospores to the culture medium (Buller, 1931). The life cycle of C. cinerea is renewed as the spores germinate.

Figure 1.1 Life cycle of Coprinopsis cinerea grown on artificial YMG medium. The life cycle starts as the basidiospores germinate to give dikaryotic mycelium, which then grows to stage 1 primordium, stage 2 primordium, immature fruiting body and eventually mature fruiting body. The cycle is renewed as basidiospores from mature fruiting body germinate again (Kües, 2000).

Figure 1.2 Fruiting body development of Coprinopsis cinerea in a 12-h-light/dark regime (Kües, 2000), from hyphal knots through initials, stage 1 and stage 2 primordia and immature fruiting bodies to mature fruiting bodies. Light is essential for the development of fruiting body initials, stage 1 primordium, stage 2 primordium and immature fruiting body.

1.3 Growth requirements of C. cinerea

1.3.1 Nutritional requirements

Despite its limited edible value, C. cinerea is important in mushroom developmental studies as many edible basidiomycetes cannot be grown and/or produce fruiting bodies in the laboratory. In the natural environment, horse dung is the main substrate of C. cinerea, but it also grows and fruits well on various synthetic culture media. Hanai et al. (2004) also showed that supplementing cultures of C. cinerea with rice husks can stimulate mycelial growth in a dose-dependent manner.

In the laboratory, mycelium cultivation and fruiting of C. cinerea are typically performed on YMG medium containing a mixture of yeast extract, malt extract and dextrose. Malt extract is very high in carbohydrate content, especially maltose, and provides carbon, protein and nutrient sources to support the growth of the mushroom. Yeast extract provides B-complex vitamins and cofactors required for growth, and additional sources of nitrogen and carbon. Moreover, dextrose serves as a source of fermentable carbohydrate. However, fruiting body formation is negatively affected by increased concentrations of C (essentially glucose and glucose analogues) and N sources (Kües, 2000), while addition of free ammonium to competent mycelia can stimulate the fruiting process (Morimoto et al., 1981).

1.3.2 Environmental factors

Apart from the nutritional factors, environmental conditions including light, temperature and humidity also play critical roles in deciding which developmental pathways C. cinerea will enter. Among these factors, light is the most important in that it determines whether a dikaryon will form hyphal knots or oidiophores with oidia. Even after hyphal knots are formed, the presence of light will decide whether they can become fruiting body initials or proceed to sclerotia (Kües, 2000). In fact, while the mycelium grows in darkness, fruiting body development is highly synchronized to alternating light-dark periods fixed by the normal day-night cycle (Ballou and Holton, 1985).

The mycelium grows at a temperature of 37°C and the fruiting process occurs best at 25-28°C for most but not all strains (Lu, 1972). Usually, higher temperatures favor the development of oidia, and lower temperatures induce fruiting body initiation. Similar to some other mushrooms, the application of a cold shock can increase the absolute number of fruiting bodies on a culture medium (Kües, 2000).

In addition, high humidity (>60%) is required for fruiting body initiation and maturation, whereas sclerotia and chlamydospores are observed at low humidity (Moore, 1981). As C. cinerea is an aerobic fungus, well-circulated fresh air and good ventilation are essential during fruiting, and accumulation of CO2 should be prevented, or else fluffy hyphae can grow from the stipe base of the primordium (P. Pukkila, personal communication).

1.4 Fruiting body development in Coprinopsis cinerea

1.4.1 Physiology of the fruiting process

The fruiting process is the most complex, yet rapid developmental event in the life cycle of C. cinerea. When nutrients are depleted, the relatively loose mesh of free, undifferentiated mycelium undergoes a drastic change to form a compact multihyphal structure with many different cell types holding each other through hyphal-hyphal interactions, known as the fruiting body (Moore, 1995; Moore, 1996).

Under ideal conditions, the fruiting process can be completed in as little as a few days following the first sign of fruiting (Kuhad et al., 1987; Lu, 1974).

Among the many environmental signals influencing fruiting, light/dark periods are the most important. In fact, fruiting body development is highly synchronized to the normal day-light rhythm cycle (Ballou and Holton, 1985; Kamada et al., 1978).

Hyphal knots (~0.2 mm in size) are formed in the dark (Boulianne et al., 2000).

Light then induces the formation of globose fruiting body initials (<2 mm in size), which are regarded as the first fruiting body-specific stage (Moore, 1981) and are the first structures showing clear histological differentiation, such as polarization of the undirected hyphal growth (Matthews and Niederpruem, 1973). Formation of the veil, pileus trama, primary gills and stipe has also started (Chiu and Moore, 1990).

Again, light is essential to induce the development of stage 1 primordium

(2-6mm in size) from the fruiting body initials (Kamada et al., 1978, Kuhad et al.,

1987). At this stage, gills, which are vertical plates arranged radially around the stipe,

are already well-developed and basidia can also be found (Kuhad et al., 1987).

Basidia are the structures in which basidiospores will be formed later, and during further

development of the stage 1 primordium, the basidia change from a cylindrical to a

club-shaped structure (Raju and Lu, 1970). Karyogamy marks the end of this stage

(Lu, 1974). At this point, tissue development in the cap is completed.

In stage 2 primordium (6-10 mm), meiosis starts within the basidia and the gills begin to separate. From this stage, it takes approximately 1-2 days for development into mature fruiting bodies (Kuhad et al., 1987). Four main processes are involved during maturation: stipe elongation (Buller, 1931), basidium maturation, pileus expansion and autolysis of the pileus.

Basidia are the sole cells in C. cinerea that express developmental commitment

(Chiu and Moore, 1988). One remarkable feature of C. cinerea is the natural synchrony of karyogamy and meiosis: in a given cap, 60-85% of all basidia are in the same developmental phase at all stages of meiosis (Kües, 2000).

Usually, in a single basidium, there are eight basidiospores and each mushroom cap is able to deliver as many as ICT to 10^ spores.

Immature fruiting bodies are about 1.5-4.5cm tall while mature ones are usually

4-7cm. During fruiting body maturation, there are no further cell divisions and most changes in the cap shape are a result of cell expansion, which is accompanied by a considerable uptake of water and is thus probably osmotically driven (Ewaze et al.,

1978). After the cap fully expands, autolysis starts from the edge of the gill closest

to the stipe (Rosin and Moore, 1985). Autolysis is caused by cell wall degradation

by chitinases, proteases and glucanases (Hammad et al., 1993) and the process

produces a brown to dark ink, carrying the basidiospores to the culture medium.

1.4.2 Other studies related to the fruiting process

For years, researchers have been interested in understanding the underlying principles of the process, as cultivation of mushrooms may be improved by better knowledge of the fruiting mechanisms, conditions and techniques.

Structural factors are believed to mediate hyphal-hyphal interaction, which is crucial for the fruiting process (Boulianne et al., 2000). Among the many molecules that have been studied, the cgl1 and cgl2 lectins have received much attention. Higher fungi were found to have more complex types of polysaccharides on their surface, suggesting that hyphal interactions may be mediated by lectins, a class of oligosaccharide-binding proteins (Guillot and Konska, 1997). The two lectins are differentially regulated during fruiting body formation: cgl2 is expressed from the early stages of fruiting body development and maintained until maturation, whereas cgl1 is specifically expressed in primordia and mature fruiting bodies (Cooper et al., 1997). Interestingly, although they show binding specificity towards β-galactosides, the two lectins share no sequence homology with other known fungal lectins but rather show homology to the family of galectins. This was the first reported case of galectins being found outside the animal kingdom (Cooper et al., 1997).

In addition to the cgl1 and cgl2 genes, other genes have also been studied, such as the ich1 gene (Muraguchi and Kamada, 1998), which is responsible for formation of the hymenium-bearing pileus, and the eln3 gene (Arima et al., 2004), which is involved in fruiting body morphogenesis and encodes a putative membrane protein with a glycosyltransferase domain.

1.5 Other biological studies in Coprinopsis cinerea

Over the past few decades, various molecular techniques have been applied to

C. cinerea to generate large amounts of data on its genomics and functional

genomics. Indeed, C. cinerea has played a very important role in fungal research.

It is one of the model mushrooms, and many of its coordinated developmental processes, such as gill formation, cap expansion, autodigestion, spore discharge and stalk elongation, are not observed in unicellular fungi. Also, as C. cinerea is a multicellular fungus, genomic analysis will reveal whether the genes responsible for its multicellularity are the progenitors of those in other multicellular organisms or are innovations restricted to the fungi. It may also provide some clues on the evolution or development of multicellularity within the fungal kingdom (Broad Institute, 2007).

Studies of C. cinerea are mainly focused on the areas of fruiting body

development, meiosis, mating factors, peroxidase production and gene silencing.

1.5.1 Meiosis studies

Karyogamy and meiosis are closely associated with the fruiting process, and their natural synchrony is

one of the remarkable features of C. cinerea (Kües, 2000).

Such feature and the fact that chromosomes of C. cinerea can be examined under

light microscopes have facilitated molecular studies of meiosis, and actually

research of meiotic development in C. cinerea had started more than two decades

ago (Pukkila et al., 1984). Fluorescence in situ hybridization (FISH) had been

applied to examine homologous pairing during synchronous meiosis and confirmed

that homologous pairing occurs rapidly after karyogamy and at 6 hours

post-karyogamy essentially all meiotic nuclei are in pachytene (Li et al., 1999). Lu

(2000) had also studied the effect of light on meiosis progression and concluded that

meiosis in C. cinerea is controlled by light/dark cycles and that light is essential to

propel basidia into karyogamy and its intensity determines the timing of meiotic

events.

In addition to studies at the chromosomal level, several genes in C. cinerea, including those for the DNA polymerase alpha-primase complex (Namekawa et al., 2003), DNA ligase I (Namekawa et al., 2003), DNA ligase IV (Namekawa et al., 2003), the LIM15/DMC1 homolog (Nara et al., 1999) and the rad9 gene (Seitz et al., 1996), have also been characterized.

1.5.2 Mating type analyses

Most mushroom species possess two mating-type loci controlling their breeding. These mating loci contain genes that are genetic regulators of sexual

compatibility and development, and as seen in C. cinerea, sometimes also of asexual

development (Hiscock and Kües, 1999). Since mushroom production depends

largely on the quality of the spawn used for inoculation, spawns conferring

resistance to certain diseases and producing high-quality fruiting bodies under

standard growing conditions are necessary (Kothe, 2001). For these reasons, C.

cinerea has long been employed to investigate the molecular basis of mating for its

easy cultivation and sexual propagation on defined medium.

The mating type genes control many conserved processes such as self-nonself

recognition and nuclear migration. The components' structure of the mating loci and

their genetic arrangements vary greatly among fungi. In C. cinerea, it was previously discovered that the A-mating loci contain three to four functionally redundant gene pairs encoding two types of homeodomain transcription factors known as HD1 and HD2 (Casselton and Olesnicky, 1998). The B-mating loci, on the other hand, contain large families of pheromones and G-protein-coupled pheromone receptors, which together confer vast numbers of different mating types (O'Shea et al., 1998). Two haploid monokaryons with

compatible alleles at both mating-type loci can mate together to produce a dikaryon

through their coordinated regulation pathways (Kamada, 2002). This complex

mating system, unlike the one in ascomycetes, involves the novel one-to-many

specificity in both pheromone receptors and homeodomain proteins interactions

(Kronstad and Staben, 1997).

A recent study has revealed many more members of the B-mating genes by molecular analyses of strains collected worldwide (Riquelme et al., 2005). These newly identified alleles are grouped based on sequence homology rather than by position, thus pointing to the complex evolutionary process that gave rise to the B-mating loci. It was also shown that alternative activation of the A and B mating-type pathways activates different developmental processes during mating and fruiting (Kües et al., 2002).

1.5.3 Peroxidase production

Phenolic compounds, classified as pollutants and toxic, are commonly found in

industrial waste waters, for example, from wood and polymer processing, textiles

and dye industries (Bratkovskaja et al., 2004). A majority of them are carcinogenic,

and as time passes, they can accumulate to critical levels of concentrations in nature.

Usually, these phenolic compounds are treated by techniques like solvent extraction,

degradation by microorganisms and adsorption on activated carbon (Bratkovskaja et

al., 2004). However, these methods are inefficient and costly. In order to handle a

wide spectrum of phenolic compounds, enzymes have been introduced and the use

of peroxidase was first proposed by Klibanov et al. (1980) to oxidize aromatic

compounds.

Initially, peroxidase was extracted from plant sources such as soybean and horseradish (Bratkovskaja et al., 2004) by laborious and expensive means. Thus, in recent years, scientists have started to study a peroxidase from C. cinerea (CiP). Since then, this non-ligninolytic fungal peroxidase has been extensively characterized (Petersen et al., 1994; Ikehata et al., 2004; Bratkovskaja et al., 2004; Houborg et al., 2003). An integrated enzymatic system consisting of C. cinerea peroxidase production, processing and usage in reactors is also being developed for the removal of phenolic compounds from aqueous wastes (Mao et al., 2006).

1.5.4 Transformation and gene silencing

Before 1986, Schizophyllum commune was the only basidiomycete for which a stable transformation system had been reported (Munoz-Rivas et al., 1986). Given the various advantages of C. cinerea, however, scientists started to investigate the possibility of transformation in this model fungus. The first report of DNA-mediated transformation and homologous integration in C. cinerea was published soon after (Binninger et al., 1987). The transforming DNA was stable through cell division, mating, fruiting body formation and meiosis. Transformation has also been employed to increase gene copy number and gene expression (Mellon and Casselton, 1988).

However, these reverse-genetics tools were inefficient and thus could not support extensive functional studies of genes. With the emergence of RNA-induced gene silencing (RNA silencing), double-stranded RNA (dsRNA) became a powerful tool for gene targeting in fungi (De Backer et al., 2002). Namekawa et al. (2005) successfully knocked down the LIM15/DMC1 gene, which is responsible for homologous chromosome synapsis during meiosis, by transformation with a LIM15/DMC1 dsRNA expression construct. Basidiospore production was reduced to only 16%, while 60% of the basidiospores were viable. Silencing using a hairpin construct of the cgl2 gene has also been reported (Walti et al., 2006). Expression of the hairpin RNAs reduced the mRNA level of the target genes by at least 90%, and demonstrated the possibility of simultaneously silencing a whole gene family with a single construct. These studies highlighted the applicability of targeted gene silencing for reverse genetics studies in this model mushroom.

1.5.5 Other studies

Other areas of research in C. cinerea are diverse in nature. One area is the study of enzymes isolated from the mushroom. The laccase family has received the most attention. It has repeatedly been linked to lignin degradation, implying a prominent role in delignification (Leonowicz et al., 2001). In addition, laccase is believed to participate in various physiological processes such as fruiting body formation (Labarere and Bernet, 1978), synthesis of melanins (Langfelder et al., 2003) and probably pigment production in mushroom tissues and basidiospores (Leatham and Stahmann, 1981). Besides laccase, a bifunctional delta 12/delta 15 fatty acid desaturase (Zhang et al., 2007) and a dihydrofolate reductase (Aimi et al., 2004) have also been studied.

Mutant analyses have also been carried out in C. cinerea research. Two mutants have been extensively studied: a temperature-sensitive mutant that shows swelling at hyphal apices when grown at 37°C (Maida et al., 1997) and a mutant sensitive to gamma radiation but insensitive to UV radiation (Ramesh and Zolan, 1995). In earlier years, a number of approaches were also employed to study the chromosomes of C. cinerea. These include gene mapping using molecular markers and marker chromosomes (Zolan et al., 1993), chromosome separation using contour-clamped homogeneous electric field (CHEF) electrophoresis and construction of chromosome-specific genomic libraries (Zolan et al., 1992), and silver staining of meiotic chromosomes (Pukkila and Lu, 1985).

1.6 C. cinerea Genome Project

The Coprinopsis genome project is a collaboration between the Broad Institute (a partnership among MIT, Harvard and affiliated hospitals) and the Coprinopsis research community. By the fall of 2003, an openly accessible genome assembly with 10-fold sequence coverage, based on the monokaryotic strain Okayama-7 (#130), was released. The assembly comprises 431 contigs, which can be further assembled into 106 supercontigs. The supercontigs range in size from 2 to 4142kb, with an average length of 342kb. The combined contigs account for a total of approximately 36.25Mb, representing 96% of the C. cinerea genome.

In early 2006, a predicted protein gene set containing 13,544 genes was released. Currently, BLAST databases for the entire assembly, as well as for individual gene and protein predictions, are available. In addition, all annotated features and any particular region of the genome assembly can be searched, visualized and downloaded.

1.7 Transcriptome Analyses

1.7.1 Serial Analysis of Gene Expression (SAGE)

Gene expression determines the overall characteristics of an organism. In the era of functional genomics, investigation of transcriptomes has become essential as it gives temporal, spatial and order information on the genes being expressed. In earlier years, techniques such as cDNA subtraction and differential display were used to compare gene expression between two cell types. Yet, they failed to provide direct information on abundance (Liang and Pardee, 1992). Other methods, including expressed sequence tags (EST) (Adams et al., 1991), RNA blotting, ribonuclease (RNase) protection and reverse transcriptase-polymerase chain reaction (RT-PCR), are able to deliver information on abundance, but they only evaluate a limited number of genes at a time (Velculescu et al., 1995). Sun et al. (2004) also showed that SAGE is far more sensitive than the EST approach for detecting low-abundance transcripts.

Nowadays, global transcriptome analyses are mainly conducted by hybridization to oligonucleotide microarrays or by counting of sequence tags. The latter involves two main techniques: serial analysis of gene expression (SAGE) and massively parallel signature sequencing (MPSS) (Brenner et al., 2000). Since MPSS can only be performed by experienced technicians at a high cost, SAGE is generally preferred over MPSS.

SAGE was invented by Velculescu et al. (1995) and allows simultaneous and quantitative analysis of a large number of transcripts. Their first investigation using SAGE identified 840 tags, representing 428 sequences from the human pancreas. Although both SAGE and oligonucleotide microarrays are capable of generating information on how genes differ in their expression levels in different samples, SAGE has a few advantages. First, only known genes can be spotted onto arrays, whereas SAGE is able to measure the expression of both known and unknown genes (Nielsen et al., 2006). Second, SAGE allows better quantification of transcripts in absolute numbers and is able to detect low-abundance transcripts: transcripts present at single-digit copy counts are still detectable by SAGE, while a microarray is likely to miss them. Lastly, cloning of all cDNAs is not required in SAGE, which is particularly important when commercial microarray slides are not available (Van Ruissen et al., 2005).

After its introduction in 1995, SAGE underwent a number of modifications to increase the specificity of the tags for transcript identification and mapping to genomes. These included LongSAGE (Saha et al., 2002), which extended the tag length to 21bp by using the type IIS restriction enzyme MmeI; SuperSAGE (Matsumura et al., 2003), which extended the tag length to 26bp by using EcoP15I as the tagging enzyme; and 3' SAGE (Wei et al., 2004), which extracted the first 18bp upstream of the polyA tail. Generation of Longer 3' cDNA from SAGE tags for Gene Identification (GLGI) and 3' rapid amplification of cDNA ends (RACE) were often coupled to SAGE, producing a 3' EST of up to a few hundred bases (Wang, 2006).

1.7.2 Analyzing the 5' end of transcripts

A number of modifications based on the principles of SAGE were made to suit new platforms and different areas of study. One of these is to analyze the bases at the extreme 5' end (the 5' untranslated region, UTR) of each mRNA transcript, an approach commonly known as 5'-end SAGE (Hashimoto et al., 2004) or 5' SAGE (Zhang and Dietrich, 2005). Studying the 5' end of transcripts allows identification and mapping of transcription start sites (TSS) and subsequent investigation of promoter elements in the upstream regions. Mapping of TSS can also contribute to our understanding of gene regulation, transcription, mRNA stability and other aspects of RNA biology (Zhang and Dietrich, 2005). A few methods have been described to study the 5' end of transcripts, and most of them involve the use of the type IIS restriction endonuclease MmeI.

Cap Analysis Gene Expression (CAGE) was among the first methods to employ the SAGE principles for analyzing the 5' end of transcripts (Shiraki et al., 2003). The method involved selection of full-length cDNA by biotinylated cap trapper and subsequent release of the 20bp 5' SAGE tags by MmeI. The tags were then joined into ditags, and the subsequent procedures were identical to the original SAGE protocol. The study analyzed four libraries from mouse brain, cortex, hippocampus and cerebellum, comprising a total of 60,922 tags with an average mapping rate of 58.5%.

Some researchers added adaptors containing an MmeI recognition site to the mRNA or cDNA end. Hashimoto et al. (2004) applied the oligo-capping method (Maruyama and Sugano, 1994), which involves the use of bacterial alkaline phosphatase and tobacco acid pyrophosphatase, to replace the cap structure of eukaryotic mRNAs with an oligoribonucleotide adaptor. Ng et al. (2005) added double-stranded adaptors to the newly synthesized first-strand cDNA and subsequently added another adaptor to the 3' end of the double-stranded cDNA, thereby linking the 5' and 3' ends of a transcript in a single ditag. The ditag was named the paired-end ditag (PET) and the method is known as Gene Identification Signature (GIS).

Another way to analyze the 5' end of the mRNA transcript is to exploit the terminal transferase and template switching properties of reverse transcriptases. This application was first described by Schmidt and Mueller (1999) for enrichment of full-length cDNA in PCR-mediated analysis of mRNAs, and was later applied in 5' SAGE analysis (Zhang and Dietrich, 2005). Since reverse transcriptases usually add a few deoxycytidine nucleotides to the newly synthesized cDNA, the template switching process serves as a means to introduce enzyme recognition sites at the 5' cDNA end. Zhang and Dietrich (2005) identified 13,746 unique sequence tags from 2231 S. cerevisiae genes, and thus highlighted the applicability of this technique for analyzing the 5' end of mRNA transcripts.

Table 1.1 General tag-based approaches for transcriptome analyses developed from the SAGE technique

Method | Description | Tag length

Quantitative analysis of transcriptome
SAGE (3'-related) | Tags extracted from the 3' end regions using NlaIII as restriction enzyme | ~14bp
SAGE (5'-related) | Tags extracted from the 5' end regions using NlaIII as restriction enzyme | 14-20bp
LongSAGE | Modified from SAGE, extracting tags from 3' end regions using MmeI | 21bp
SuperSAGE | Modified from SAGE, extracting tags from 3' end regions using EcoP15I | 26bp

Quantitative analysis of transcriptome and genome annotation
CAGE | Tags extracted from the most 5' end by cap-trapping and MmeI digestion | ~20bp
5' SAGE | Tags extracted from the most 5' end by oligo-capping and MmeI digestion | ~17bp
3' SAGE | Tags extracted from the most 3' end upstream of the poly(A) tail | ~18bp
GIS | Ditags comprising tags from the most 5' and 3' end of the same transcript | ~36bp

1.7.3 Mapping of SAGE tags to the genome

The reliability of SAGE and 5' SAGE analyses depends on accurate and unambiguous mapping of the tags to the genome; this is the critical step determining the amount and quality of information that can be extracted from SAGE experiments. Considering the efficiency of SAGE and 5' SAGE for gene discovery and annotation, genomic sequence provides the best source for tag mapping and gene discovery (Malig et al., 2006). However, the use of genomic sequences represents a bioinformatics challenge, as the complexity of large genomes makes unique tag mapping more difficult (Wahl et al., 2005). Increased tag length also markedly decreased the mapping rate of LongSAGE tags. For example, only 22% (137,333 out of 632,814) of human LongSAGE tags could be mapped to the SAGEmap database (Wang, 2006). This is believed to be related to the increased probability of incorporating base errors or single nucleotide polymorphisms in longer tags.
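The mapping step described above can be sketched as follows. This is a minimal illustration, assuming exact string matching of each tag against both strands of a genome held in memory; the function names and the toy sequences are hypothetical and are not taken from any of the cited studies, which typically rely on indexed alignment tools.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_all(genome, pattern):
    """Return every 0-based start position of pattern in genome."""
    positions = []
    start = genome.find(pattern)
    while start != -1:
        positions.append(start)
        start = genome.find(pattern, start + 1)
    return positions


def map_tag(genome, tag):
    """Classify a tag as 'unique', 'multiple' or 'unmapped'.

    Hits are (position, strand) pairs; minus-strand hits are reported at the
    plus-strand coordinate where the reverse complement of the tag matches.
    """
    hits = [(pos, "+") for pos in find_all(genome, tag)]
    rc = reverse_complement(tag)
    if rc != tag:  # a palindromic tag would otherwise be counted twice
        hits += [(pos, "-") for pos in find_all(genome, rc)]
    if not hits:
        status = "unmapped"
    elif len(hits) == 1:
        status = "unique"
    else:
        status = "multiple"
    return status, hits


# Toy example (hypothetical sequences, far shorter than real tags or genomes)
genome = "GATTACAGGCCTTGATTACA"
for tag in ["CAGGCC", "TCAAGG", "GATTACA", "AAAAAA"]:
    print(tag, map_tag(genome, tag))
```

A tag classified as "multiple" is ambiguous, so its count cannot be assigned to a single locus; this is why larger and more repetitive genomes lower the unique mapping rate.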

Keime et al. (2007) made similar observations. They examined all public human LongSAGE libraries and deduced that more than 70% of the tags still did not match the genome sequence after removing those likely to have resulted from sequencing errors. Some of these tags corresponded to parts of human mRNAs, such as polyA tails, junctions between two exons and polymorphic regions of transcripts. Even among the tags mapped to the genome, 31% corresponded to unannotated transcripts, and of those mapped to known transcribed regions, a number were located in antisense orientation or in new variants of these known transcripts. They concluded that the human genome is much more complex than shown by the current genome annotations.

In 5' SAGE analyses, mapping percentages are generally higher, possibly due to a slightly shorter tag length (Hashimoto et al., 2004). Several 5' SAGE studies on human, yeast and maize reported unique tag mapping rates higher than 70% (Hashimoto et al., 2004; Zhang and Dietrich, 2005; Gowda et al., 2006). These studies were able to improve current transcription start site annotations in reference databases and published literature. Most of them revealed the diversity of TSS at the cellular level and found that some genes could have as many as 16 TSS, although some TSS are preferred over others. Hashimoto et al. (2004) also anticipated that alternative transcription may frequently induce alternative splicing. Moreover, Zhang and Dietrich (2005) confirmed and expanded the previously reported consensus pattern flanking the transcription start sites. In addition, 5' SAGE is a valuable tool for predicting novel genes and identifying potential non-coding RNA genes and upstream open reading frames.

1.8 High-throughput sequencing

1.8.1 Pyrophosphate sequencing

Traditionally, sequencing has been performed mainly by the Sanger method. In large-scale sequencing projects such as whole-genome sequencing, DNA fragments have to be cloned into bacterial vectors, followed by amplification, purification of templates and finally sequencing using fluorescent chain-terminating nucleotide analogues and either slab gel or capillary electrophoresis. In this way, the cost of sequencing a human genome was estimated to be $10-25 million (Margulies et al., 2005).

Although alternative methods such as detecting the release of pyrophosphate (Ronaghi et al., 1996) had been described, they did not come into large-scale application until Margulies et al. (2005) developed an integrated system based on pyrophosphate sequencing (pyrosequencing) in a sequencing-by-synthesis approach. The system employed an emulsion-based method for DNA isolation and amplification and used a novel fibre-optic slide to perform sequencing in picolitre-sized wells. It was capable of generating over 25 million bases with a Phred quality score of 20 or higher in one four-hour machine run. Despite substantially shorter reads and lower average individual read accuracy compared with Sanger sequencing, the system showed outstanding utility, throughput, accuracy and robustness when used to sequence the genome of the bacterium Mycoplasma genitalium, achieving 96% coverage and 99.96% accuracy in a single run.
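The quality threshold mentioned above can be made concrete with a short sketch. It assumes the standard Phred definition, Q = -10 log10(P), where P is the estimated probability that a base call is wrong; the function names are illustrative only.

```python
import math


def phred_to_error_probability(q):
    """Convert a Phred quality score to the probability of a wrong base call."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Convert an error probability (0 < p <= 1) to a Phred quality score."""
    return -10 * math.log10(p)


# Q20 corresponds to an expected error rate of 1 in 100 bases
print(phred_to_error_probability(20))
```

Under this definition, a base with a Phred score of 20 or higher has an estimated call accuracy of at least 99%.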

1.8.2 Applications of pyrosequencing

The pyrosequencing system has applications in many research fields such as whole genome sequencing (Hofreuter et al., 2006), chromosome structure (Albert et al., 2007), small RNA (Berezikov et al., 2006), amplicon analyses (Thomas et al., 2006) and transcriptome analyses. In fact, transcriptome analyses have been much facilitated by the pyrosequencing approach, as it does not require purification and cloning of concatemers and eliminates the tedious steps of colony picking and plasmid DNA purification (Gowda et al., 2006).

The system has been used to study the transcriptome of the maize shoot apical meristem (SAM) (Emrich et al., 2007), in which >25,000 maize genomic sequences were annotated and about 400 expressed transcripts not previously identified in other species were captured. Gowda et al. (2006) also analyzed the 5' end of maize transcripts, and demonstrated the existence of complex alternative transcription start sites, promoter regions and 5' poly(A) tail in these transcripts.

Nielsen et al. (2006) also investigated transcripts of potato tubers at the time of harvest and at dormancy, and compared the results with LongSAGE data to demonstrate the greater power of detection and the multiplexing of samples. The analysis allowed counting of more than 300,000 tags with less effort and cost, thus facilitating the measurement of rare transcripts that are beyond the detection limit of existing global transcript profiling technologies (Nielsen et al., 2006).

In addition, the pyrosequencing system has been used to study the transcriptomes of a prostate cancer cell line (Bainbridge et al., 2006), Medicago truncatula (Cheung et al., 2006) and Arabidopsis (Weber et al., 2007). All these studies highlight the applicability of the system as a powerful profiling method in complex genomes.

1.9 Aims of project

It has long been a goal of the fungal research community to understand the mechanisms underlying the fruiting process as well as the roles and biological functions of different genes. With the genome sequenced and the protein gene set available, performing 5' SAGE on the mycelial and primordial stages of Coprinopsis cinerea can generate a large amount of information on gene expression in these stages, thereby providing clues on how genes behave during the fruiting process as well as in other cellular and physiological processes. In addition to expression data, 5' SAGE can also facilitate genome annotation by mapping the SAGE tags to the genome to indicate transcription start sites. Promoter studies can be facilitated as well, since the promoter region lies upstream of the transcription start sites.

Therefore, the project aims to perform 5' SAGE analysis on the dikaryotic mycelium and primordium of Coprinopsis cinerea and map the resulting SAGE tags to the genome. The reliability and accuracy of the 5' SAGE data set will be confirmed by Northern blot analyses and quantitative real-time PCR of selected genes. In this way, the genes differentially expressed in these two developmental stages can be identified and characterized. By comparison with the previously established SAGE and EST datasets from another edible mushroom, Lentinula edodes, important players in fruiting body initiation can be revealed and a better understanding of fruiting body development can be gained.

Chapter 2 5' Serial Analysis of Gene Expression (5' SAGE) from mycelial and primordial stages of C. cinerea

2.1 Introduction

Coprinopsis cinerea, commonly known as the inky cap, is one of the model organisms widely used to study developmental processes and mushroom biology in homobasidiomycetous fungi. Over the past few decades, various molecular techniques have been applied to generate large amounts of data on fruiting body development, meiosis, mating factors, enzyme production and gene silencing in this mushroom.

As gene expression determines the overall characteristics of an organism, transcriptome analysis has become popular in the era of functional genomics. Until now, no comprehensive analysis of the transcriptome of C. cinerea has been performed, even though its easy cultivation in the laboratory and the availability of the genome sequence facilitate extensive downstream genetic and molecular studies. 5' SAGE analysis of C. cinerea generates valuable data towards a better understanding of its various physiological processes.

In earlier years, gene expression data were generated by techniques such as cDNA subtraction and differential display, but these could not provide direct information on abundance (Liang and Pardee, 1992). Other methods such as expressed sequence tags (EST) (Adams et al., 1991), RNA blotting, ribonuclease (RNase) protection and reverse transcriptase-polymerase chain reaction (RT-PCR) are able to deliver information on abundance, but they only evaluate a limited number of genes at a time (Velculescu et al., 1995). Nowadays, transcriptome data are mainly generated through SAGE-derived technologies (Velculescu et al., 1995) and oligonucleotide microarrays. The original SAGE protocol extracts a short sequence tag (10-14bp) from each mRNA transcript near the 3' end, and each tag contains sufficient information to uniquely identify a transcript. The frequency of each tag reveals the expression level of the corresponding gene.
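The relationship between tag frequency and expression level can be illustrated with a minimal counting sketch. Scaling the counts to tags per million is an assumption made here for comparing libraries of different sizes and is not a step stated in the text; the function name and the example tags are hypothetical.

```python
from collections import Counter


def tag_expression(tags):
    """Return {tag: (raw_count, tags_per_million)} for a list of tag sequences."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {
        tag: (count, count * 1_000_000 / total)
        for tag, count in counts.items()
    }


# Hypothetical library of four sequenced tags
library = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAC", "TTGCATTGCA"]
profile = tag_expression(library)
print(profile["ACGTACGTAC"])  # (3, 750000.0)
```

Comparing the normalized value of the same tag between two libraries gives a first indication of differential expression.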

Since SAGE was invented in 1995, it has undergone a number of modifications. For instance, LongSAGE (Saha et al., 2002) and SuperSAGE (Matsumura et al., 2003) extended the tag length to 21 and 26bp respectively, while 5' SAGE (Zhang and Dietrich, 2005) and 3' SAGE (Wei et al., 2004) define the transcription start sites and termination sites. Like the original SAGE, 5' SAGE allows comparison of gene expression patterns between different developmental stages; in addition, it maps the transcription start sites to the genome, thereby facilitating promoter analysis.

Figure 2.1 shows a schematic diagram illustrating the major procedures of 5' SAGE. First, mRNA isolated from the total RNA of each stage was reverse transcribed into cDNA using SuperScript III reverse transcriptase and oligo(dT) primer. As the enzyme adds a few deoxycytidine residues to the 3' end of the newly synthesized cDNA, a template switching primer was added to introduce additional sequences containing the recognition site of the restriction enzyme MmeI. MmeI cuts 20/18bp downstream of its recognition site, resulting in tags about 50bp long. Two tags were then ligated together to form a ditag.
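The geometry of the MmeI digestion can be illustrated with a minimal sketch. It assumes the known MmeI recognition sequence TCCRAC with cleavage 20 nt downstream on the top strand, and it assumes that exactly three G residues from the template switching oligo precede the transcript-derived bases. The oligo is TS oligo A as given in Section 2.2.1.4; the example transcript and the function name are hypothetical, and only the top strand is modelled.

```python
import re

# MmeI recognizes TCCRAC (R = A or G) and cuts 20 nt downstream on the top strand
MMEI_SITE = re.compile(r"TCC[AG]AC")
TOP_STRAND_CUT = 20


def release_5prime_tag(cdna, leading_g=3):
    """Return (released_fragment, transcript_tag) for the first MmeI site.

    leading_g is the assumed number of G residues contributed by template
    switching. Returns None if no site is found or the sequence is too short.
    """
    match = MMEI_SITE.search(cdna)
    if match is None:
        return None
    cut = match.end() + TOP_STRAND_CUT
    if cut > len(cdna):
        return None
    released = cdna[:cut]
    tag = cdna[match.end() + leading_g:cut]
    return released, tag


# TS oligo A (ribonucleotide g written as G) followed by a hypothetical transcript
ts_oligo_a = "GGGATTTGCTGGTGCAGTACAGGATCCGACGGG"
transcript = "ACGTACGTACGTACGTACGTACGTACGT"
released, tag = release_5prime_tag(ts_oligo_a + transcript)
print(len(released), tag)
```

With the 33-nt oligo, the released fragment is 50 nt long and carries 17 transcript-derived bases, which is consistent with the tag lengths quoted in the text.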

[Figure 2.1: schematic of the 5' SAGE procedure. Labels recoverable from the diagram: template switching primer, mRNA, ds cDNA synthesis, anchor primer, restriction digestion (MmeI, 20/18bp), 50bp tag, 100bp ditag.]

Figure 2.1 A schematic diagram illustrating the major procedures in 5' SAGE. The isolated mRNAs from mycelium and primordium were reverse-transcribed using SuperScript III reverse transcriptase and template switching primers. Anchor primers and oligo(dT) were used for ds cDNA synthesis. The cDNAs were then digested by MmeI to generate 50bp tags. Two 50bp tags were ligated to give a 100bp ditag, which was then sequenced by sequencing-by-synthesis.

The ditags were then sequenced by a sequencing-by-synthesis approach at 454 Life Sciences, using an integrated system based on pyrophosphate sequencing. Individual tags were then extracted from the ditags using Perl scripts. As the current C. cinerea genome annotation contains no gene expression data, these tags provide valuable information on the expression of genes in the mycelial and primordial stages.
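The Perl scripts are not shown in this section, but the idea of tag extraction can be sketched in Python. The sketch below is an illustration under stated assumptions rather than the procedure actually used: it assumes that each ditag read is flanked by the shared 3' end of the TS oligos (GGATCCGACGGG) and its reverse complement, that each tag is 17 nt long, and that the second tag is read from the opposite strand. The function names and example sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Shared 3' end of TS oligos A and B, with ribonucleotide g written as G
LEFT_FLANK = "GGATCCGACGGG"


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_ditag(read, flank=LEFT_FLANK, tag_len=17):
    """Return (tag1, tag2) from a ditag read, or None if it cannot be parsed."""
    right_flank = reverse_complement(flank)
    start = read.find(flank)
    end = read.rfind(right_flank)
    if start == -1 or end == -1:
        return None
    insert = read[start + len(flank):end]
    if len(insert) < 2 * tag_len:
        return None  # insert too short to hold two full tags
    tag1 = insert[:tag_len]
    tag2 = reverse_complement(insert[-tag_len:])
    return tag1, tag2


# Hypothetical ditag built from two 17-nt tags
tag_a = "ACGTACGTACGTACGTA"
tag_b = "TTGCATTGCATTGCATT"
read = LEFT_FLANK + tag_a + reverse_complement(tag_b) + reverse_complement(LEFT_FLANK)
print(split_ditag(read))
```

Taking one tag from each end of the insert, instead of cutting at a fixed midpoint, keeps the sketch tolerant of small variations in ditag length.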

The C. cinerea genome annotation is still ongoing and can be accessed online at http://genome.semo.edu/cgi-bin/gbrowse/cc/. The website uses genomic sequence data from the Broad Institute and is maintained by Southeast Missouri State University. Gene predictions and other information are continuously contributed by the Coprinopsis research community. Under this genome annotation, all predicted C. cinerea genes exist as gene models known as GLEAN models (a model comprising the exons and introns, flanked by the start and stop codons). All important features, such as Coprinopsis known genes, ESTs, Pfam domains and BLASTX results, are readily available as different track options. Users can search the genome by genome assembly contig position, GLEAN model, domain, gene or even EST identity. Figure 2.2 shows an overview of the Coprinopsis cinerea genome annotations website.

[Figure 2.2: screenshot of the genome browser showing a 1 kbp region of ccin_Contig179 (positions 131,421 to 132,420), with tracks for GLEAN models, alternate GLEAN models, GLEAN Pfam domains (galactoside-binding lectin) and protein similarities.]

Figure 2.2 An overview of the C. cinerea genome annotations website, using galactose binding lectin (cgl2) as an example to illustrate various features available for analysis (http://genome.semo.edu/cgi-bin/gbrowse/cc/).

2.2 Materials and Methods

2.2.1 5' SAGE libraries construction

2.2.1.1 Mushroom mycelium and primordium cultivation

The C. cinerea strain used was a dikaryotic strain (with 50% Okayama-7 origin) backcrossed for 5 generations with Okayama-7, the monokaryotic strain used for the genome sequencing project. This means that the strain is ~98.44% homologous to Okayama-7 while retaining the ability to fruit, thus allowing fruiting bodies to be analyzed.
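The 98.44% figure follows from simple arithmetic, sketched below under the assumption that each backcross to Okayama-7 halves, on average, the fraction of the genome that does not derive from Okayama-7. The function name is illustrative, and the result is an expectation rather than a measured value.

```python
def expected_recurrent_fraction(initial_fraction, backcrosses):
    """Expected genome fraction from the recurrent parent after backcrossing.

    initial_fraction is the recurrent-parent share before the first backcross;
    each backcross is assumed to halve the remaining non-recurrent share.
    """
    non_recurrent = 1.0 - initial_fraction
    return 1.0 - non_recurrent * 0.5 ** backcrosses


# 50% Okayama-7 origin followed by 5 backcrosses
print(expected_recurrent_fraction(0.5, 5))  # 0.984375
```

The value 0.984375 corresponds to 1 - (1/2)^6 and rounds to 98.44%.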

The C. cinerea strain was cultivated on YMG medium containing yeast extract, malt extract and potato dextrose solidified with Bacto® agar. The mycelium was cultured on agar plates at 37°C for about 7 days until the mycelium covered the whole agar surface. The primordium was induced by incubating the mycelial culture at 25°C under a light/dark regime of 14/10hr. The incubator was kept at a relative humidity higher than 60%.

2.2.1.2 RNA extraction

Total RNA was extracted from mycelium when it had grown over the whole agar surface, and from stage 1 primordia when they had grown to a height of about 5mm. The RNA was extracted with TRI® reagent (Molecular Research Center, Inc). All sample handling instruments, including mortars, pestles, spatulas, forceps and blades, were pre-baked at 300°C overnight and cooled to -20°C before use. Latex gloves and pipette tips were free of RNase contamination. Samples were freshly scraped or cut into a mortar filled with liquid nitrogen and immediately ground to a powder with the pestle. About 100mg of sample was then transferred to each RNase-free microcentrifuge tube containing 1ml of TRI® reagent, and the tubes were vortexed for 15 minutes at room temperature. After vortexing, the tubes were centrifuged at 12,000xg for 10 minutes at 4°C and the supernatants were transferred to new microcentrifuge tubes. 0.2ml of chloroform was then added to each tube, followed by vigorous shaking for 15 seconds and incubation at room temperature for 15 minutes. The tubes were centrifuged at 12,000xg for 15 minutes at 4°C. After centrifugation, the RNA-containing upper aqueous phase was carefully transferred to new microcentrifuge tubes, to which 0.25ml of salt solution (0.8M sodium acetate, 1.2M sodium chloride) and 0.25ml of isopropanol were added for RNA precipitation. The mixtures were then incubated at -20°C for half an hour, followed by centrifugation at 12,000xg for 8 minutes at 4°C. The supernatants were discarded and the RNA pellets were washed with 1ml of 70% ethanol. The washed pellets were centrifuged again at 12,000xg for 5 minutes. After discarding the supernatants, the pellets were allowed to dry at room temperature for 8 minutes and subsequently resuspended in 20µl of DEPC-treated water by incubating at 55°C for 10 minutes. The RNA was kept at -80°C until use.

As the quality of RNA is crucial for SAGE experiments, the integrity and purity of the RNA had to be checked beforehand. The quality of the RNA was examined by loading 1µl of each sample on a 1.2% formaldehyde agarose gel. Concentration and OD260/280 ratio were determined with a Gene Quant II (Pharmacia Biotech) after diluting the samples 70 times with DEPC-water.
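As a worked illustration of the concentration measurement, the sketch below assumes the commonly used conversion that an absorbance of 1.0 at 260 nm corresponds to about 40 µg/ml of RNA, and applies the 70-fold dilution described above. The absorbance readings are hypothetical, and the instrument may perform this calculation internally.

```python
RNA_UG_PER_ML_PER_A260 = 40.0  # common conversion factor for RNA (assumption)


def rna_concentration(a260, dilution_factor=70):
    """Return the RNA concentration of the undiluted sample in µg/ml."""
    return a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor


def a260_a280_ratio(a260, a280):
    """Return the OD260/280 purity ratio."""
    return a260 / a280


# Hypothetical readings from a 70-fold diluted sample
print(rna_concentration(0.5))      # 1400.0 µg/ml
print(a260_a280_ratio(0.5, 0.25))  # 2.0
```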

2.2.1.3 Isolation of mRNA

Isolation of mRNA was performed with PolyATract® mRNA isolation system

from Promega. Procedures were based on manual from manufacturer with some

minor amendments. Two separate reactions were set up: one for mycelium and one

for primordium. First of all, about Img of total RNA was pooled and brought to a

30 final volume of 500|LI1. The RNA was incubated at 65°C for 10 minutes followed by addition of 3|il of biotinylated-Oligo(dT) and 20|il of 20X SSC solution. The tubes were allowed to cool to room temperature for probe annealing.

Tubes containing streptavidin-paramagnetic particles (SA-PMPs) were resuspended by gentle flicking and subsequently captured by the magnetic stand and washed with 0.5X SSC (300µl per wash) 3 times. After the final wash, the SA-PMPs were resuspended in 100µl of 0.5X SSC.

Then, the entire content of the annealing reaction was added to the washed SA-PMPs followed by incubation at room temperature for 10 minutes with occasional mixing by inverting every 1-2 minutes. The SA-PMPs were captured and washed with 0.1X SSC (300µl per wash) 4 times. After the final wash, the mRNAs were eluted twice, with 100µl and 150µl of RNase-free water respectively.

The eluted mRNAs were precipitated by mixing with 50µl of 3.75M ammonium acetate, 300µl 100% isopropanol and 3µl glycogen. The mixture was incubated at -20°C overnight, followed by centrifugation at 14,000xg for 20 minutes. The pellets were washed with 1ml cold 70% ethanol and finally resuspended in 12µl DEPC-water.

2.2.1.4 cDNA synthesis

First strand cDNA was synthesized using the SuperScript™ III First-Strand Synthesis System for qRT-PCR from Invitrogen. Two separate first strand synthesis and template switching reactions were applied for each developmental stage (mycelium and primordium). Each of the reactions contained 5µg of mRNA, 2µl of template switching (TS) oligo A or B (TS oligo A: 5'-GGGATTTGCTGGTGCAGTACAGGATCCGACggg-3'; TS oligo B: 5'-GCTGCTCGAATTCAAGCTTCTGGATCCGACggg-3', where 'g' stands for ribonucleotide), and 9µl of DEPC-H2O. The reactions were incubated at 65°C for 5 minutes and were then placed on ice immediately. After that, 20µl of 2X RT Reaction Mix and 4µl of RT Enzyme Mix were added to the reactions to make the total volume 40µl. The tubes were then incubated at 42°C for 90 minutes followed by incubation at 85°C for 5 minutes. 2µl of RNaseH was added for mRNA digestion at 37°C for 20 minutes. The reactions were kept on ice.

Second strand cDNA synthesis was performed by low cycle primer extension using Advantage® 2 polymerase (Clontech). 10µl of 10X Advantage 2 PCR buffer, 4µl of dNTP (10mM each), 4µl of CDS primer (10µM) (5'-CAGTGGTATCAACGCAGAGTAC(dT)20VN-3'), 4µl of anchor primer A or B (10µM) (anchor primer A: 5'-GGGATTTGCTGGTGCAGTACAGGATCCGAC-3'; anchor primer B: 5'-GCTGCTCGAATTCAAGCTTCTGGATCCGAC-3'), 3µl Advantage® 2 polymerase and 33µl ddH2O were added to each 42µl first strand reaction. The reactions underwent a PCR program of 72°C for 5 minutes; 95°C for 45 seconds; 5 cycles of 95°C for 10 seconds, 55°C for 30 seconds and 68°C for 4 minutes; and finally 68°C for 3 minutes. The PCR products were then purified using the QIAquick PCR purification kit from Qiagen and were eluted in 30µl of ddH2O. Thus, for each developmental stage, there were two portions of cDNA to be digested by MmeI.

2.2.1.5 MmeI digestion and polyacrylamide gel electrophoresis

The 30µl cDNA synthesis reactions were mixed with 5µl of 10X NEB buffer 4, 1.8µl SAM (1.6mM), 5µl MmeI (2U/µl) and 8.2µl of ddH2O. The final volumes were 50µl and the reactions were incubated at 37°C for 2 hours for complete digestion. Then, 150µl of low salt TE buffer (LoTE) (2.5mM Tris HCl, 0.25mM EDTA, pH8.0) was added to the reactions followed by extraction with 200µl phenol/chloroform. After centrifugation, ~200µl of the upper phase was transferred to a new microcentrifuge tube, where 133µl 7.5M ammonium acetate, 1ml 100% ethanol and 3µl glycogen were added. The mixtures were incubated at -20°C overnight followed by centrifugation at 15,000rpm at 4°C for 30 minutes. The cDNA pellets were then washed with 1ml 70% ethanol and finally resuspended in 10µl LoTE.

The digested cDNA fragments were electrophoresed on a polyacrylamide gel. The gel was cast at 15% using 40% (w/v) 29:1 acrylamide/bis-acrylamide solution in a final concentration of 1X TBE solution and had a dimension of 8.3 x 7.3cm, with a thickness of 0.75mm. The entire volume of digested cDNA fragments was loaded into the wells and electrophoresed using the Mini-PROTEAN electrophoresis system (Bio-Rad Laboratories, Inc.).

After electrophoresis, the ~50bp bands were excised and disrupted by mechanical force. Subsequently, 150µl solution (125µl LoTE, 25µl 7.5M ammonium acetate) was added and the mixtures were incubated at 4°C overnight followed by another incubation at 37°C for 1 hour. The mixtures were then centrifuged at 15,000rpm for 10 seconds and ~150µl solution was collected from each tube. At this point, the two portions of MmeI-digested cDNA were pooled together and were precipitated with 150µl 7.5M ammonium acetate, 1ml 100% ethanol and 3µl glycogen at -70°C for 4 hours. Pellets were then collected by centrifugation at 15,000rpm at 4°C for 30 minutes. The pellets were washed with 1ml cold 70% ethanol. After drying in air for 5 minutes, the MmeI-digested cDNA was resuspended in 5µl LoTE.

2.2.1.6 Formation and amplification of ditag

The two portions of MmeI-digested cDNA were ligated to form 100bp ditags. The 5µl of digested products were mixed with 1µl 10X T4 ligation buffer, 1.5µl Tris-HCl, 1.5µl T4 DNA ligase (400U/µl) and 1µl ddH2O. The ligation reaction proceeded at 16°C overnight and was terminated by incubation at 65°C for 10 minutes.

The 10µl ligation products were divided into two portions for PCR amplification of the ditags. Each reaction contained: 5µl 10X Mg-free buffer, 3.5µl MgCl2, 2.5µl dNTP (10mM each), 5µl anchor primer A (10µM), 5µl anchor primer B (10µM), 5µl ditags, 1µl Platinum® Taq polymerase and 22.5µl ddH2O to make the final volume 50µl. The mixtures then underwent a PCR program of 95°C for 2 minutes; 10 cycles of 95°C for 30 seconds, 65°C for 45 seconds and 72°C for 20 seconds; and finally 72°C for 3 minutes.

The two PCR reactions were pooled together with addition of 100µl LoTE and extracted with 200µl phenol/chloroform. ~200µl of the upper phase was collected and precipitated with 133µl 7.5M ammonium acetate, 1ml 100% ethanol and 3µl glycogen at -20°C overnight. The pellet was obtained by centrifugation at 15,000rpm at 4°C for 25 minutes and subsequently washed with 70% ethanol. Finally, the pellet was resuspended in 15µl ultra-pure water.

2.2.2 Identification of 100bp ditag

The identity of the 100bp ditag was confirmed by TA cloning using cloning vector pMD18-T (Takara Biotechnology). For both mycelial and primordial stages, 2µl of the PCR product of the 100bp ditag was mixed with 0.5µl pMD18-T vector and 5µl ligation mix. The reaction mixtures were incubated at 16°C for 30 minutes followed by incubation with DH5α competent cells on ice for 30 minutes. The cell cultures then underwent heat-shock transformation at 42°C for 45 seconds and were placed on ice for more than 1 minute. After that, 890µl SOC medium was added to the cultures followed by incubation at 37°C for an hour. The cells were then spun down and the total volume was reduced to 200µl. 100µl of each culture was spread on LB-ampicillin selection plates and the plates were incubated at 37°C for 16 hours. Single colonies were screened using screening primers BcaBest™ M13-47 and BcaBest™ RV-M. The PCR protocol was similar to that used for amplification of the 100bp ditag, except that the annealing temperature was changed to 55°C and extension lasted for 30 seconds for 30 cycles.

Products of PCR screening with the expected size were purified using the Qiagen PCR purification kit and were eluted in 40µl ddH2O. 15µl was used for sequencing with the BcaBest™ M13-47 primer at TechDragon Limited.

2.2.3 High throughput pyrosequencing

The ditags were shipped on dry ice to 454 Life Sciences (Connecticut, U.S.A.) for pyrosequencing on the GS20 sequencer. The samples from the mycelial and primordial stages were run on two separate region metrics. Before shipping, the ditags were checked by agarose gel electrophoresis and spectrophotometry. OD ratios and DNA concentrations were recorded.

2.2.4 Tags extraction from ditags

Perl scripts were written to extract individual tags from the ditags and for genome mapping and expression abundance comparison. The sequenced ditags were first checked for the presence of the two MmeI recognition sites and for a length of 38-40bp between the two recognition sites. Qualified ditags were then subjected to tag extraction.

From each of the ditags, the first tag was obtained by extracting 20bp downstream of the first MmeI recognition site, and the second tag was defined as the 20bp upstream of the second MmeI recognition site. Subsequently, individual bases of the tags were checked for quality scores. All bases were required to attain a Phred-equivalent quality score of at least 20 (a score of 20 implies that the estimated error probability of that base is 0.01), except for bases in homopolymer stretches, in which the fourth nucleotide (and onwards) usually has a lower quality score.
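The quality criterion above can be illustrated with a minimal Python sketch. The scripts used in this study were written in Perl and are not reproduced here, so the function names, the argument layout and the way the homopolymer exemption is applied are assumptions made for illustration only. The conversion uses the standard Phred relation P = 10^(-Q/10).

```python
def phred_to_error_probability(q):
    """Convert a Phred-scaled quality score to an estimated error probability."""
    return 10 ** (-q / 10)


def tag_passes_quality(tag, quals, min_q=20, homopolymer_exempt_from=4):
    """Return True if every base reaches min_q.

    Bases at or beyond the fourth position of a homopolymer run are exempt
    (hypothetical reading of the exemption described in the text).
    """
    run_length = 0
    for i, (base, q) in enumerate(zip(tag, quals)):
        # Track how far we are into the current run of identical bases.
        run_length = run_length + 1 if i > 0 and base == tag[i - 1] else 1
        if q < min_q and run_length < homopolymer_exempt_from:
            return False
    return True
```

For example, a low score on the fourth or fifth T of a TTTTT stretch would be tolerated under this reading, whereas a low score on the third T would reject the tag.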

Qualified tags then had all the starting 3-5 guanine nucleotides removed (as these Gs were added during reverse transcription by the reverse transcriptase). This resulted in tags with a length of 15-17 nucleotides.
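The ditag qualification and tag extraction steps described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the Perl code used in the study: the recognition site is taken as TCCGAC (the MmeI site as it appears at the 3' end of the TS oligos), the second tag is reported as its reverse complement so that both tags read 5' to 3', and the function names are hypothetical. Reads whose tags do not start with at least three Gs are rejected, in line with the discard criterion mentioned in the Discussion.

```python
import re

MMEI_SITE = "TCCGAC"      # MmeI site at the 3' end of the TS oligo (top strand)
MMEI_SITE_RC = "GTCGGA"   # the same site as it appears at the far end of a ditag
TAG_LEN = 20              # bases taken next to each recognition site

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def extract_tags(read):
    """Return (tag1, tag2) with the leading Gs removed, or None if the read
    is not a qualified ditag."""
    start = read.find(MMEI_SITE)
    if start == -1:
        return None
    insert_start = start + len(MMEI_SITE)
    end = read.find(MMEI_SITE_RC, insert_start)
    if end == -1:
        return None
    insert = read[insert_start:end]
    if not 38 <= len(insert) <= 40:
        return None
    raw_tags = (insert[:TAG_LEN], reverse_complement(insert[-TAG_LEN:]))
    trimmed = []
    for tag in raw_tags:
        leading_gs = re.match(r"G{3,5}", tag)  # Gs added by template switching
        if leading_gs is None:
            return None
        trimmed.append(tag[leading_gs.end():])
    return tuple(trimmed)


# Illustrative read modelled on the layout of a cloned ditag:
# primer end + tag 1 region + tag 2 region + primer end.
read = "GGATCCGAC" + "GGGGCACCAACTCCTTTTTT" + "TGATAGCATGACGGCGCCC" + "GTCGGATCCAGA"
tags = extract_tags(read)
```

The sketch takes the first downstream occurrence of the second site, so a tag that itself contains GTCGGA would be rejected; a production script would need to handle that case explicitly.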

2.2.5 Genome mapping and annotation

The individual tags were first grouped to generate a set of tags known as the unique tags (non-duplicated tags). Perl scripts were written to match the tags to the genome sequence, and the numbers of tags with 'unique match to genome', 'multiple matches to genome' and 'no match to genome' were recorded. Those with a single exact match to the genome were retained for further downstream expression analysis. Moreover, the positions and orientations at which the tags matched with respect to each gene (open reading frame) were also recorded in relation to the putative 5'-UTR, coding region and putative 3'-UTR.
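The grouping and exact-match mapping described above can be sketched in Python as below. This is a simplified illustration and not the Perl implementation used in the study: it assumes a single genome string (a real assembly would be searched contig by contig), searches both strands by plain substring matching, and uses hypothetical function names.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def find_all(genome, pattern):
    """Return the start positions of all (possibly overlapping) exact matches."""
    hits = []
    pos = genome.find(pattern)
    while pos != -1:
        hits.append(pos)
        pos = genome.find(pattern, pos + 1)
    return hits


def map_tag(genome, tag):
    """Return a list of (position, strand) exact matches on both strands."""
    hits = [(pos, "+") for pos in find_all(genome, tag)]
    rc = reverse_complement(tag)
    if rc != tag:  # avoid counting a palindromic tag twice
        hits += [(pos, "-") for pos in find_all(genome, rc)]
    return hits


def categorize(hits):
    if not hits:
        return "no match to genome"
    return "unique match to genome" if len(hits) == 1 else "multiple matches to genome"


def summarize(genome, tags):
    """Group tags into unique tags and count how many fall in each category."""
    unique_tags = Counter(tags)  # unique tag -> occurrence
    categories = Counter(categorize(map_tag(genome, t)) for t in unique_tags)
    return unique_tags, categories
```

Only tags in the 'unique match to genome' category would then be carried forward, with their position and strand, to the gene-level annotation.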

The full set of mycelial and primordial unique tags was uploaded to the C. cinerea genome annotation website in the form of a track comprising the '5' SAGE tags (mycelium)' and '5' SAGE tags (primordium)' sub-tracks.

2.3 Results

2.3.1 5' SAGE libraries construction

2.3.1.1 cDNA synthesis

The integrity and purity of the total RNA extracted were assessed by formaldehyde agarose gel and spectrophotometry. The ratio of the absorbance (A260/280) was around 1.8, indicating little protein and DNA contamination. On the denaturing gel, the RNA showed clear and sharp 28S and 18S rRNA bands and no obvious degradation of RNA was observed.

The cDNA for 5' SAGE libraries construction was synthesized from mRNA isolated from the total RNA. The first strand cDNA was synthesized using SuperScript III reverse transcriptase and template switching primers. The double-stranded cDNA was synthesized by low cycle primer extension and showed a continuous spectrum of sizes ranging from ~300bp to ~4kb on a 1.5% agarose gel, indicating that transcripts of various sizes were included (Figure 2.3).

Figure 2.3 Agarose gel electrophoresis of double-stranded cDNA synthesized from mRNA isolated from mycelium and stage 1 primordium of C. cinerea. The cDNA showed a size range of ~300bp to ~4kb on the 1.5% agarose gel, indicating that the cDNA contains transcripts of various sizes.

2.3.1.2 MmeI digestion and ditag formation

The cDNA from mycelium and stage 1 primordium was digested with MmeI at 37°C for 2 hours to give 50bp fragments. Following phenol/chloroform extraction and ethanol precipitation, the MmeI-digested fragments were electrophoresed on a 15% polyacrylamide gel. Sharp bands at the 50bp position were observed, indicating successful integration of the introduced sequences through the template switching primers during cDNA synthesis and successful digestion by MmeI (the template switching primer is 30bp long with the MmeI recognition site at its 3' end; as MmeI cuts 20bp downstream of the recognition site, the digested fragments were 50bp). The bands at ~50bp were excised, pooled, purified and ligated at 16°C overnight to give 100bp ditags. The ditags were then amplified by 10 cycles of PCR and the purified products were analysed on a 2% agarose gel. Figures 2.4 and 2.5 show the 50bp fragments following MmeI digestion and the 100bp ditags after PCR amplification.

Figure 2.4 Polyacrylamide gel electrophoresis of the 50bp fragments after MmeI digestion of mycelial and primordial cDNA. Sharp bands at the 50bp position on a 15% PA gel indicated successful integration of the MmeI recognition site during cDNA synthesis and MmeI digestion.

Figure 2.5 Agarose gel electrophoresis of 100bp ditags after low-cycle PCR amplification. Main bands were observed at the 100bp position of the 2% agarose gel, indicating successful ligation of the two 50bp MmeI-digested fragments.

2.3.2 Identification of 100bp ditags

Small portions of the 100bp ditags from the mycelial and primordial 5' SAGE libraries were cloned using TA cloning. Sequencing of some cloned 100bp ditags revealed that they contained the sequence information as expected. Figure 2.6 shows the information of one of the sequenced clones. The sequence consisted of the pMD18-T vector sequence (blue) flanking the ditag sequence, and the two 5' SAGE tags lie between the two template switching primer sequences (green and red). By default, the first tag is extracted 20bp downstream of the first MmeI recognition site and the second tag is extracted 20bp upstream of the other recognition site.

GTGAACACGGCAGTGCCAGCTTGCATGCCTGCAGGTCGACGATTGGGATTTGCTGGT
GCAGTACAGGATCCGACGGGGCACCAACTCCTTTTTTTGATAGCATGACGGCGCCCGT
CGGATCCAGAAGCTTGAATTCGAGCAGCAATCTCTAGAGGATCCCCGGGTACCGAGC
TCGAATTCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATGGTTTATCCGCTCAG

Legend: pMD18-T vector sequence; TS oligo A/B primer sequence; MmeI recognition site; tag sequence (Tag 1 and Tag 2 are marked between the two MmeI recognition sites).

Figure 2.6 Sequence information of one of the cloned 100bp ditags from the mycelial stage. The ditag sequence is flanked by the vector sequence, and the two 5' SAGE tags are located in between the two MmeI recognition sites.

2.3.3 High throughput pyrosequencing

Data from pyrosequencing on GS20 sequencer contained a total of 83,048 and

115,761 sequence reads for mycelial and primordial stages respectively. With an

average read length of 95.5 bases, this accounted for almost 19Mb (18,987,797 bases

to be exact) of sequence. Table 2.1 summarizes the primary data obtained and Figure

2.7 shows the read length distribution of all sequence reads.

Table 2.1 Summary of raw data obtained from GS20 pyrosequencing.

Samples               Mycelial ditags   Primordial ditags   Total
Reads                 83,048            115,761             198,809
Bases                 7,934,028         11,053,715          18,987,797
Average read length   95.5              95.5                95.5

The raw data of the pyrosequencing consisted of 83,048 and 115,761 reads for the mycelial and primordial stages respectively, with an average read length of 95.5 bases for both stages.

Figure 2.7 Read length distribution of all sequence reads for mycelial and primordial ditags (panels: mycelial ditags, primordial ditags; x-axis: read length). The average read length was 95.5bp for ditags from both stages. The length variation between reads was small.

2.3.4 Tags extraction from ditags

According to the extraction criteria stated in Materials and Methods, a total of 107,046 and 146,369 tags were extracted from the mycelial and primordial 5' SAGE libraries respectively. This represented a tag yield rate of 64.4% for the former and 63.2% for the latter, figures comparable to those of a previous study on potato tubers by Nielsen et al. (2006). After grouping the tags using Perl scripts, 38,366 unique (non-duplicated) mycelial tags and 51,521 unique primordial tags were obtained (Table 2.2).

Among the mycelial tags, 940 unique tags (2.45%) had an occurrence of 10 or above, while 4,259 (11.1%) occurred 3 times or more; the unique tags had an average occurrence of 2.79. For the primordial tags, 1,584 unique tags (3.07%) had an occurrence of 10 or above, while 7,656 (14.9%) occurred 3 times or more; the unique tags had an average occurrence of 2.84.
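The yield rates and average occurrences reported above can be reproduced from the read and tag counts with a short Python sketch. It assumes that each sequence read holds one ditag and can therefore yield at most two tags; this assumption is not stated explicitly in the text but is consistent with the reported percentages. The function names are illustrative.

```python
def tag_yield_percent(valid_tags, reads, tags_per_read=2):
    """Percentage of the theoretically obtainable tags that passed extraction."""
    return 100 * valid_tags / (tags_per_read * reads)


def mean_occurrence(valid_tags, unique_tags):
    """Average number of times each unique tag was observed."""
    return valid_tags / unique_tags


mycelium_yield = tag_yield_percent(107_046, 83_048)
primordium_yield = tag_yield_percent(146_369, 115_761)
mycelium_mean = mean_occurrence(107_046, 38_366)
primordium_mean = mean_occurrence(146_369, 51_521)
```

Rounded to the precision used in the text, these give 64.4% and 63.2% for the yield rates and 2.79 and 2.84 for the average occurrences.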

2.3.5 Genome mapping and annotation

The mycelial and primordial unique tags were mapped to the C. cinerea genome using the sequences from the Broad Institute. The mapping process resulted in three categories of unique tags: tags matched to a single site on the genome, tags matched to multiple sites of the genome and tags without an exact match. For the mycelial tags, 29,858 out of 38,366 tags (77.8%) matched uniquely to the C. cinerea genome.

2,732 (7.12%) matched to multiple positions and 5,776 (15.1%) could not match to any position.

For the primordial tags, 40,083 out of 51,521 tags (77.8%) matched uniquely to the genome. 3,676 (7.13%) matched to multiple positions and 7,762 (15.1%) could not be matched to any position. The results for tag extraction and genome mapping are summarized in Table 2.2.

For each of the tags matched uniquely to the genome, the mapping position was also recorded. The tags were categorized into seven groups depending on their positions relative to the gene models (Table 2.3).
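The categorization relative to a gene model can be sketched as follows. This Python sketch is illustrative only and rests on several assumptions not taken from the thesis: a tag is represented by a single genome coordinate and a strand, a gene model by the coordinates of its start and stop codons and its strand, and introns are ignored so that everything between the start and stop codons counts as coding region. The windows of 1,000bp upstream and 500bp downstream follow the definitions used for the putative UTRs.

```python
def classify_tag(tag_pos, tag_strand, gene_start, gene_end, gene_strand,
                 upstream=1000, downstream=500):
    """Return (region, orientation) of a tag relative to one gene model.

    gene_start <= gene_end are genome coordinates spanning the ATG to the
    stop codon; for a '-' strand gene the ATG lies at gene_end.
    """
    if gene_start <= tag_pos <= gene_end:
        region = "coding region"
    else:
        # Signed distance measured along the gene's direction of transcription:
        # negative values lie upstream of the ATG, positive values lie
        # downstream of the stop codon.
        if gene_strand == "+":
            offset = tag_pos - gene_start if tag_pos < gene_start else tag_pos - gene_end
        else:
            offset = gene_end - tag_pos if tag_pos > gene_end else gene_start - tag_pos
        if -upstream <= offset <= -1:
            region = "putative 5'-UTR"
        elif 1 <= offset <= downstream:
            region = "putative 3'-UTR"
        else:
            return ("unclassified", None)
    orientation = "sense" if tag_strand == gene_strand else "anti-sense"
    return (region, orientation)
```

Three regions in two orientations plus the unclassified group give the seven categories. In practice each tag would be tested against the neighbouring gene models, and a tag assigned to none of them would be counted as unclassified.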

Table 2.2 Summary of the tag extraction and genome mapping.

                                   Mycelium 5' SAGE   Primordium 5' SAGE
Total valid tags                   107,046            146,369
Total unique tags                  38,366             51,521
Tag occurrence
  - ≥ 10                           940 (2.45%)        1,584 (3.07%)
  - ≥ 3                            4,259 (11.1%)      7,656 (14.9%)
Genome mapping
  - Unique match to genome         29,858 (77.8%)     40,083 (77.8%)
  - Multiple matches to genome     2,732 (7.12%)      3,676 (7.13%)
      - 2-4 matches                2,517 (6.56%)      3,376 (6.55%)
      - 5-10 matches               153 (0.40%)        195 (0.38%)
      - ≥ 11 matches               62 (0.16%)         105 (0.20%)
  - No match to genome             5,776 (15.1%)      7,762 (15.1%)

Table 2.3 Gene-associated positions of tags mapped to the C. cinerea genome.

                                 Mycelium 5' SAGE                  Primordium 5' SAGE
                                 Unique tags      Occurrence       Unique tags      Occurrence
Putative 5'-UTR (-1000, -1) a    10,514 (35.2%)   49,718 (56.5%)   15,071 (37.6%)   61,964 (52.4%)
Coding region (sense)            9,354 (31.3%)    13,201 (15.0%)   12,141 (30.3%)   20,195 (17.1%)
Putative 3'-UTR (+1, +500) a     1,531 (5.1%)     2,867 (3.3%)     1,791 (4.5%)     3,031 (2.6%)
Putative 5'-UTR (anti-sense) b   1,541 (5.2%)     4,363 (5.0%)     2,089 (5.2%)     8,826 (7.5%)
Coding region (anti-sense) b     3,043 (10.2%)    6,030 (6.8%)     4,306 (10.7%)    8,902 (7.5%)
Putative 3'-UTR (anti-sense) b   1,179 (3.9%)     2,231 (2.5%)     1,492 (3.7%)     2,889 (2.4%)
Unclassified c                   2,696 (9.0%)     9,643 (10.9%)    3,193 (8.0%)     12,450 (10.5%)
Total                            29,858 (100%)    88,053 (100%)    40,083 (100%)    118,257 (100%)

a The (-1000, -1) refers to position upstream of the annotated ATG start codon, while (+1, +500) refers to position downstream of the STOP codon.
b 'Anti-sense' means that the tag is in opposite orientation to the annotated 5'-UTR, coding region or 3'-UTR.
c 'Unclassified' means that the tags could not be assigned to any annotated gene.

The 29,858 mycelial and 40,083 primordial uniquely-matched tags were combined together, and a total of 62,865 non-duplicated uniquely-matched tags were isolated. Of the 62,865 tags, 45,290 (72%) were located in the region 1,000bp

upstream of the ATG start codon to 500bp downstream of the stop codon. Among these 45,290 tags, ~47% were mapped to the putative 5'-UTR, ~46% to the coding region and the rest to the putative 3'-UTR. Investigation into the tags mapped to the coding region showed that they likely resulted from premature termination during reverse transcription or from degraded mRNA template, because >92% of these tags were found to occur only once or twice, suggesting that most of them are indeed spurious. However, tags with higher occurrence may represent genuine unannotated transcripts on the same strand, as increasing evidence has revealed significant complexity in various genomes in terms of overlapping transcripts (Kapranov et al., 2007).

Considering the tags mapped to the putative 5'-UTR, most of them were located

within 200bp upstream relative to the ATG start codon (Figure 2.8). This distance

was longer than that previously reported in yeast, in which most transcripts start

within 15-75bp upstream of the start codon (Zhang and Dietrich, 2005). However, in

line with their observation, more than 90% of the tags with an occurrence more than

3 were within the first 500bp of the putative 5'-UTR.

A remarkable number of potential anti-sense transcripts were also identified.

Out of the 62,865 tags, 12,153 (19%) were in anti-sense orientation. Approximately

half (53.6%) of them were located in the coding region, with roughly equal

proportions in the putative 5'-UTR (25.9%) and 3'-UTR (20.5%). Moreover, approximately 9% of the 62,865 tags had no gene annotation within 1kb upstream or downstream and were therefore grouped as unclassified.

Figure 2.8 Distribution of tags mapped to the putative 5'-UTR upstream relative to the start codon. Most transcriptional start sites were located within 200bp upstream of the start codon, and more than 90% of the tags having an occurrence more than 3 were within the first 500bp of the putative 5'-UTR.

2.4 Discussion

2.4.1 5' SAGE libraries construction

5’ SAGE allowed simultaneous analysis of the transcriptomes between mycelial and primordial stages and transcription start sites of various genes in the genome of

C. cinerea. This promotes the characterization of differentially expressed genes which may play an important role during fruiting body initiation and development.

Defining transcription start sites also facilitates studies of the promoter and potential regulatory elements.

5' SAGE libraries for the mycelial and primordial stages were successfully constructed. During the construction of the libraries, a number of experimental conditions had to be optimized and a few problems had to be overcome, so it took longer than expected to synthesize the ditags for pyrosequencing.

Pyrosequencing at 454 Life Sciences yielded almost 19Mb of sequence, corresponding to 95% of the theoretical capacity of the GS20 sequencer. This figure is comparable to previous studies (Gowda et al., 2006; Nielsen et al., 2006). In addition, no average read length data had been reported before, but at an average read length of 95.5bp, the tag extraction process was not hindered.

2.4.2 Tags extraction and genome mapping

A total of 253,415 valid tags were extracted from 198,809 ditag sequences, representing a tag yield rate of 63.7%. Most ditags were discarded because they did not possess the three or more deoxyguanine or deoxycytosine nucleotides added as a consequence of the template switching primers, or because the length of the sequence flanked by the two MmeI recognition sites was not within 38-40bp. The latter was probably a result of improper cutting of MmeI at positions other than the typical 18/20bp downstream of the recognition site, which had also been suggested by Zhang and Dietrich (2005). As observed in other SAGE libraries, a majority of unique tags (more than 85%) occurred only once or twice (Hashimoto et al., 2004; Zhang and Dietrich, 2005). This was also observed in the 5' SAGE dataset, in which 88.9% and 85.1% of tags had an occurrence of only one or two in the mycelial and primordial stages respectively. These tags are usually regarded as results of sequencing errors or premature termination during reverse transcription, but may also reflect the high sensitivity of 454 high-throughput pyrosequencing in detecting rare transcripts that are hardly detectable by the ordinary SAGE protocols or EST-based approaches. In fact, during the analysis of transcription start sites (TSS), we noticed that in a series of TSSs for a particular gene, some of the start sites were rarely used, thus giving a low occurrence of the corresponding tags (Section 2.4.3).

The proportions of tags matched uniquely to the genome (~77.8%), matched to multiple positions (~7.12%) and unable to be matched to the genome (~15.1%) were similar for both mycelial and primordial stages. The proportion of uniquely matching tags (77.8%) was similar to or even higher than those of several reported 5' SAGE libraries, including Saccharomyces cerevisiae (70.8%) (Zhang and Dietrich, 2005), HEK293 human cell (77.5%) (Hashimoto et al., 2004) and E14 mouse embryonic stem cell (55.8%) (Wei et al., 2004). This higher percentage may be attributed to a small genome size or to few sequence discrepancies between the strain used and the reference strain. On the other hand, in a mapping study of human LongSAGE tags by Keime et al. (2007), it was suggested that tags occurring only once had a significantly lower mapping percentage than those with multiple occurrences. However, this phenomenon was not observed in this project, as the mapping rate for tags occurring once (77.3% for Myc and 76.8% for Pri) was similar to that for tags occurring multiple times (79.6% for Myc and 80.4% for Pri). This suggests that sequencing errors may not be the major source of the tags occurring once. In this respect, Nielsen et al. (2006) also commented on the accuracy of pyrosequencing and claimed that the overall estimate of sequencing error using pyrosequencing is in fact lower than that of traditional Sanger sequencing. Also, the tags are well within the first 90bp of the ditag sequences, which were determined with the highest accuracy (Margulies et al., 2005).

Approximately 15% of the tags could not match to the genome. It is proposed that the major reasons for this are strain sequence discrepancies (as the sequenced strain is a monokaryon while the strain used in the project is a dikaryon), sequencing errors and single nucleotide polymorphisms. As suggested by Keime et al. (2007), a smaller proportion may be a result of tags spanning two exons.

2.4.3 Observations based on the genome mapping data

Considering the tags that matched uniquely to the C. cinerea genome, 72% were mapped to the region from 1,000bp upstream of the ATG start codon to 500bp downstream of the stop codon of annotated genes in a sense orientation and 19% were in anti-sense orientation. The percentage of tags mapped in anti-sense orientation to annotated genes is higher than that previously reported in yeast (Zhang and Dietrich, 2005), and this may imply a more complex genome structure in C. cinerea. On the other hand, of the tags that mapped in sense orientation and had an occurrence of 3 or above, approximately 80% in both mycelial and primordial stages were mapped to the putative 5'-UTR, suggesting that they likely represent genuine transcription start sites. Based on the genome mapping data, a number of observations were made.

Alternative transcription start sites. There is great interest in elucidating the control of transcription initiation, as it is one of the major components of gene regulatory networks that underlie the development and diversity of organisms (Levine and Davidson, 2005). Core promoters themselves were previously believed to be functionally simple, but recent data suggested that they are indeed structurally complex, with a range of alternative TSSs at the base pair level (Kawaji et al., 2006). As observed in many 5' SAGE and full-length cDNA analyses such as those on yeast (Zhang and Dietrich, 2005), human (Hashimoto et al., 2004), maize (Gowda et al., 2006) and Arabidopsis (Alexandrov et al., 2006), alternative transcription start sites were also discovered in C. cinerea, as signified by arrays of closely located initiation sites. Investigation of the most highly expressed genes in the mycelial and primordial stages revealed that all of them have more than one TSS and that there are always one or two preferred TSSs. This phenomenon is exemplified by the gene encoding 60S ribosomal protein L24 (GLEAN_00733), which was highly expressed in the primordial stage (Figure 2.9). The gene was suggested to have 12 transcription start sites, two of which were predominant over the rest.

A key issue regarding this phenomenon is whether it is merely 'biological noise' resulting from imprecise binding of basal transcription factors or whether TSSs are precisely regulated. Kawaji et al. (2006) showed, from extensive Cap Analysis of Gene Expression (CAGE) studies, that TSSs are utilized in a tissue-specific manner, thereby proposing a new level of biological complexity within promoters and a relationship to epigenetic transcriptional regulation. Such an observation had also been explicitly demonstrated in the Arabidopsis glutathione-S-transferase F8 (GSTF8) gene, in which differential expression and subcellular targeting are achieved through alternative TSSs. Thatcher et al. (2007) revealed that the complexity of the stress-responsive GSTF8 promoter results from the use of multiple TSSs to encode two in-frame proteins differing only in their N-terminal sequence. The most 3' TSS gives rise to the smaller, major form of GSTF8, whereas the upstream TSSs are more weakly utilized and encode the larger form of the protein. They observed that the smaller form of the protein is highly expressed in the root compared with the leaves and is much more stress-responsive, while the larger form has the opposite expression pattern. Moreover, the smaller form is cytoplasmic whereas the larger form is solely targeted to the plastids. Intriguingly, investigation into the GST gene (GLEAN_11886) in C. cinerea resulted in a similar finding, in which there was a single start site located in the putative 5'-UTR and another start site likely located within the first exon. Presumably, this may follow a certain level of regulation, as depicted for the GSTF8 gene in Arabidopsis.

Figure 2.9 Alternative transcription start sites in the gene encoding 60S ribosomal protein L24 in primordial stage. The gene contains an array of 12 different TSSs and two of them are predominant over the others (green box).

Presence of anti-sense transcripts. Since RNA was suggested decades ago by Jacob and Monod (1961) to bear regulatory roles, a significant number of RNA-based regulatory systems and RNA regulators have been characterized. Previous analyses of the mammalian transcriptomes proposed that up to 20% of transcripts may contribute to sense-antisense pairs (Kiyosawa et al., 2003), and large-scale cDNA sequencing in the Functional Annotation of Mouse 3 (FANTOM3) project suggested that antisense transcription is more widespread. In this study, transcripts in anti-sense direction to the 5'-UTR, gene coding region and 3'-UTR were found to account for approximately 20% of all the uniquely matched tags in both developmental stages.

Some of the tags could be identified in both 5' SAGE libraries, implying that they did not occur spuriously (Figure 2.10).

Generally, antisense transcription has been ascribed roles in gene regulation involving degradation of the corresponding sense transcripts, as well as gene silencing at the chromatin level (Katayama et al., 2005). These tags may represent a regulatory mechanism through transcription of non-coding RNAs (ncRNAs) acting in a cis interaction with the target. An obvious advantage of cis-based RNA signaling, as suggested by Mattick and Makunin (2005), is the ease of localization of interacting RNA components. However, based on global transcriptome analysis, increasing evidence points to the possibility that antisense transcripts may link neighboring genes in complex loci into chains of linked transcriptional units. In fact, expression profiling has revealed frequent coincident regulation of sense/antisense pairs (and occasional independent regulation), and experimental evidence showed that perturbation of an antisense RNA can alter the expression of the sense mRNA (Katayama et al., 2005). These data suggest that antisense transcription can affect the transcriptional outputs.

Figure 2.10 Tags in anti-sense orientation found in the coding region of the gene encoding histone H4 in both mycelial and primordial stages. The two antisense tags shared the same sequence, meaning that they do not occur spuriously and may represent an antisense transcript to the sense mRNA, or an unannotated gene as a result of overlapping of transcriptional units in the C. cinerea genome.

New genes prediction. Among the 62,865 unique tags, about 10% did not correlate to any annotated gene according to the criteria used in this study. Among these tags, approximately 10% had an occurrence of 3 or above, suggesting that while most of these tags are probably spurious, some may correspond to unpredicted transcripts or unknown RNA coding genes.

An increasing amount of evidence shows that eukaryotic genomes can be transcribed on both strands, implying extensive overlap of transcriptional units and regulatory elements (Kapranov et al., 2007). Given the extent of such transcriptional overlap, it is also possible that some of the anti-sense transcripts may actually represent transcripts of another functional gene rather than merely a regulatory element (Figure 2.9). Differential spatial and temporal expression probably prevents pairing between the two complementary transcripts. Nevertheless, a recent study showed that sense-antisense pairings are also prevalent in human and other eukaryotic species, and many are in fact conserved during evolution (Zhang et al., 2006).

Improving current genome annotations. In addition to redressing the current genome annotations for missing potential antisense transcripts and novel genes as commented above, the 5' SAGE data can also be used to correct some conceivable mistakes made during annotation. One of the major inaccuracies of genome sequence-based annotation is that it is difficult to precisely predict the ATG start codon when there are multiple in-frame ATGs close to the 5'-end (Zhang and Dietrich, 2005). By default, the most upstream ATG is usually assigned as the start codon. Using the TSS data obtained, 383 and 561 tags from mycelium and primordium respectively were found to be located in the first 50bp downstream of the currently annotated start codon, and investigation of the gene models associated with these tags reveals possible errors in annotation. For a number of these genes, no or few tags were found in the putative 5'-UTR, and most of the tags were mapped to the first exon. This is illustrated by a gene encoding a 60S ribosomal protein P2 (Figure 2.11), in which most of the TSS tags are mapped to the first exon rather than the 5'-UTR. Studying the gene model shows that an ATG codon is present in the first three nucleotides of the third exon, and alignment of the protein sequence with those of other fungal species reveals homology only starting from the third exon. This suggests that the first two exons may be erroneously annotated.
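The screen for tags lying just downstream of an annotated start codon can be sketched as follows. This is a minimal illustration in Python, not the procedure actually used in this study: the `Gene` record, the function names and the coordinates are hypothetical, and tag positions are assumed to be single genomic coordinates on the same strand as the gene.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    strand: str        # '+' or '-'
    start_codon: int   # genomic coordinate of the A of the annotated ATG


def offset_from_start(gene, tag_pos):
    """Signed distance of a tag from the annotated ATG (positive = downstream)."""
    if gene.strand == '+':
        return tag_pos - gene.start_codon
    return gene.start_codon - tag_pos


def tags_within_cds_start(gene, tag_positions, window=50):
    """Return the tag positions lying in the first `window` bp downstream of the ATG."""
    return [p for p in tag_positions
            if 0 < offset_from_start(gene, p) <= window]


# Hypothetical example: a plus-strand gene whose annotated ATG is at position 1000.
demo = Gene("demo_gene", "+", 1000)
print(tags_within_cds_start(demo, [950, 1010, 1050, 1051]))
```

Genes for which most tags fall inside this window, with few or none in the putative 5'-UTR, would be candidates for manual inspection of the annotated start codon.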

Figure 2.11 Possible errors in predicting the exon-intron boundary for the gene encoding 60S ribosomal protein P2. Most of the TSSs are mapped to the region within the first exon rather than the 5'-UTR. As an ATG codon is present at the first three bases of the third exon, and homology with other fungal species only starts from the third exon, the exon-intron boundary of this gene may be erroneously annotated.

Chapter 3 Validation of expression patterns of 5' SAGE libraries and analysis of differentially expressed genes

3.1 Introduction

It is common practice to validate SAGE data using other molecular techniques such as DNA microarrays (Zhang and Dietrich, 2005) and real-time PCR (Siu et al., 2001; Cimica et al., 2007). In the first 5' SAGE study, Zhang and Dietrich (2005) used yeast whole-genome gene expression microarray data to fit a linear regression model in R and showed that tag abundance was well correlated with the level of gene expression.

In order to assess the reliability and accuracy of the mycelial and primordial 5' SAGE data, northern blotting and quantitative real-time PCR analyses were used to examine the expression levels of several selected genes.

Northern blotting (Alwine et al., 1977) was developed primarily for detecting and determining the size of specific RNA molecules, and is now widely employed to study expression patterns of transcripts among different RNA samples. It is applicable to the detection of any RNA species for which an appropriate probe is available. The technique involves size-dependent fractionation of total RNA molecules in a denaturing agarose gel and the use of 28S and 18S ribosomal RNA intensities as loading controls. Also, the length of the probes can be manipulated to achieve different specificities of hybridization (Reue, 1998). The expression level of a transcript is quantified from computer-analyzed band intensities following X-ray autoradiography or chemiluminescent detection.

The DIG system provides a convenient, nonradioactive means of probe synthesis. It uses dUTP-coupled digoxigenin, a steroid hapten, to label DNA, RNA, or oligonucleotides for hybridization. The only prerequisite of this system is that some sequence information of the target sequences is needed for appropriate primer synthesis.

Quantitative real-time PCR, commonly known as real-time PCR, is an effective means of measuring gene expression among multiple samples (Gibson et al., 1996; Heid et al., 1996). As its name suggests, real-time PCR collects data in real time throughout the PCR process, thereby combining amplification and detection into a single step. Real-time PCR has several benefits over other methods of quantifying gene expression. For instance, it has a wide and accurate dynamic range (Morrison et al., 1998) and requires no post-amplification manipulation. Its sensitivity is also much higher than that of the RNase protection assay (Wang et al., 1999) and dot blot hybridization (Malinen et al., 2003), and even single copy detection is possible (Palmer et al., 2003). In addition, real-time PCR requires only a small amount of total RNA, and analysis is possible even when only a partial sequence of the gene is available (Stanton, 2001).

A real-time PCR reaction is characterized by the cycle at which amplification of the target sequence raises the fluorescence intensity above the background level and the signal begins to increase exponentially; this cycle is the threshold cycle (Ct). A higher concentration of target cDNA template (higher expression level) in the starting material therefore produces an earlier significant increase in the fluorescent signal and a lower Ct value (Heid et al., 1996). SYBR Green I is one of the commonly employed DNA binding dyes due to its low cost and readiness to use (Ramos-Payen et al., 2003). Since DNA binding dyes do not bind in a sequence-specific manner, the specificity of the products is confirmed by performing melting curve analysis (Ririe et al., 1997).

Traditionally, gene expression data from real-time PCR need to be normalized in order to correct for or monitor sample-to-sample variation. This is usually achieved by comparing the results against a control gene that may also serve as a positive control for the reaction (Wong and Medrano, 2005). This gene, commonly known as the housekeeping gene, should be expressed at an unchanged level regardless of the experimental conditions. As no single gene meets this criterion for every experimental condition, the expression stability of a gene must be validated before it is used as a housekeeping gene. A few housekeeping genes for C. cinerea have been suggested before, such as G3PDH (Namekawa et al., 2003) and Ras (Kikuchi et al., 2004), but they were only tested within a single developmental stage over a short time course. A housekeeping gene that is stably expressed across different developmental stages therefore had yet to be identified, and northern blotting was used here to search for one.

After validation of the expression patterns, the 5' SAGE libraries can be reliably used to study differentially expressed genes between mycelium and primordium. The expression level of each gene was first evaluated by considering the tags mapped to the putative 5'-UTR of the annotated genes, and the Fisher Exact Test was subsequently used to confirm differential expression. The test is commonly used to examine the significance of the association between two variables, especially when the chi-square test is not applicable due to a small sample size. Through investigating these differentially expressed genes, it is possible to gain insights into how they behave during the fruiting process, thereby providing clues about the mechanisms underlying fruiting initiation and development.

3.2 Materials and Methods

3.2.1 Identification of housekeeping gene by Northern Blot analysis

3.2.1.1 RNA fractionation by formaldehyde gel electrophoresis

Total RNAs were extracted from dikaryotic mycelium, stage 1 primordium, stage 2 primordium, stage 1 immature fruiting body, stage 2 immature fruiting body and mature fruiting body using TRI® reagent (Molecular Research Center, Inc.) as described in chapter 2.

5µg of total RNA from each sample was first fractionated on a 1% formaldehyde gel. The amount of RNA loaded was adjusted using the intensity of 28S rRNA as loading control. The gel tank, casting tray and comb were treated with 3% H2O2 for 10 minutes before use. The formaldehyde gel was prepared by dissolving 0.4g agarose in 35ml of DEPC-H2O, and 1.5ml 37% formaldehyde and 5ml 10X MOPS (0.2M MOPS, 80mM sodium acetate, 10mM EDTA, pH8.0) were added after cooling down.

The calculated amounts of the RNA samples were mixed with 7.5µl 100% formamide, 2.75µl 10X MOPS, 2µl DEPC-H2O, 3µl 10X loading dye and 0.8µl ethidium bromide. The mixtures were denatured at 55°C for 20 minutes and quickly chilled on ice before loading. The samples were then loaded and run at 50V for 2 hours.

3.2.1.2 Transfer of RNAs

All forceps, glass rods and scissors were treated with RNase ZAP® (Ambion) before use. First, the formaldehyde gel was rinsed in two changes of DEPC-treated 10X SSC (1.5M sodium chloride, 0.15M sodium citrate) to remove formaldehyde. A PosiBlot 30-30 Pressure Blotter (Stratagene) was used to transfer the RNAs to a nylon membrane (Biodyne® B 0.45µm, Pall Corporation). The membrane, filter papers and the sponge were presoaked in 10X SSC.

The pressure blotter was assembled according to the manufacturer's instructions. The wetted filter paper and nylon membrane were carefully positioned onto the porous membrane support pad, and any wrinkles or air bubbles were smoothed out using the glass rod. A window was cut at the center of the mask such that the window was slightly larger than the size of the gel by about 1.5cm on each side. The mask was then placed onto the membrane. Subsequently, the equilibrated gel was placed onto the mask such that the wells did not fall into the area of the window, to ensure an even fluid flow. On top of the gel, a wetted filter paper and the 10X SSC-saturated sponge were placed, and any bubbles were smoothed out using the glass rod. The lid of the blotter was then closed and the air compressor was adjusted to a pressure of 75mmHg. The outlet of the compressor was then connected to the blotter inlet port and the pressure was allowed to rise and reach an equilibrium of at least 70mmHg. The setup was disconnected after the gel had been blotted for about 3 hours. The membrane was then removed and fixed by UV cross-linking at 1200 x 100µJ (Stratagene).

3.2.1.3 Probe preparation

Probes were synthesized from Qiagen kit-purified PCR products of Cc.Pma, Cc.G6PDH and Cc.Ras amplified from total cDNA of the mycelial stage. Approximately 100ng of PCR products were used as templates for PCR DIG-labeling using the PCR DIG probe synthesis kit (Roche). The PCR reactions contained 1X PCR buffer, 1.6mM MgCl2, 0.2X DIG PCR labeling solution (Roche), 0.2µM BcaBest M13-47 primer (Takara), 0.2µM BcaBest RV-M primer (Takara) and 5U Taq polymerase in a total volume of 30µl. The PCR mixtures then underwent a programme of denaturation at 94°C for 2 minutes, followed by 35 cycles of denaturation at 94°C for 45 seconds, annealing at 58/54°C for 30 seconds and extension at 72°C for 2 minutes, and a final extension at 72°C for 10 minutes. The DIG-labeled PCR products were checked by 1.5% agarose gel electrophoresis.

Table 3.1 Primers for amplification of the potential housekeeping genes.

Gene       Primers                            Amplicon size (bp)
Cc.Pma     5'-CTCGCCCAAATCGGTTCCTTCTG-3'      820
           5'-CACCAGTGACCATCTTGACCTTG-3'
Cc.G6PDH   5'-GCATTCAAACAGACCCTATCCTA-3'      815
           5'-AACTGGATGCGAACCTCAACCTT-3'
Cc.Ras     5'-GTCGTAGGTGGTGGTGGTATGTT-3'      960
           5'-CGTGGTGCCTGTCCTGGGTGTAG-3'

3.2.1.4 Hybridization, Stringency washes and signal detection

The membrane was first pre-hybridized with 7.5ml hybridization buffer (50% formamide, 5X SSC, 2X blocking solution, 50mM sodium phosphate, 0.1% N-lauroylsarcosine and 7% (w/v) SDS) at 42°C with gentle shaking for 2 hours in a hybridization bag. Following pre-hybridization, 7.5ml of fresh hybridization buffer was added to a new hybridization bag with 6µl heat-denatured probes. The hybridization step took place at 42°C with gentle shaking for 16 hours.

After hybridization, the membrane was washed in 30ml cold washing solution (2X SSC, 0.1% SDS) twice at room temperature for 15 minutes, followed by two washes in hot washing solution (0.5X SSC, 0.1% SDS) at 68°C for 15 minutes each. The membrane was then rinsed in 30ml washing buffer (0.1M maleic acid, 0.15M NaCl, 0.3% v/v Tween® 20, pH 7.5) for 2 minutes. The blocking process was performed by adding 20ml 2X blocking solution (2% blocking reagent, 0.1M maleic acid, 0.15M NaCl, pH 7.5) for 2 hours with gentle shaking. Then 1µl anti-digoxigenin AP Fab fragments (Roche), diluted 1:10000 with 10ml 2X blocking solution, was incubated with the membrane for 30 minutes at room temperature with gentle shaking. The membrane was then washed with 30ml washing buffer twice for 15 minutes and then with 20ml detection buffer (0.1M Tris-HCl, 0.1M NaCl, pH 9.5) for 2 minutes. Finally, the membrane was equilibrated in 1ml CSPD® (Roche) working solution (10µl in 1ml detection buffer) in a hybridization bag.

Chemiluminescent detection was performed by exposing the sealed membrane to a BioMax Light film (Kodak) in a dark room for about 4 hours. The film was photographed and the signals were quantified with the Kodak 1D 3.5.3 Image Analysis programme (Kodak).

3.2.2 Quantitative real-time PCR

3.2.2.1 cDNA synthesis from 2 developmental stages

Total RNAs were extracted from dikaryotic mycelium and stage 1 primordium as described in chapter 2.

0.5µg of total RNA from each developmental stage was used to synthesize cDNA for real-time PCR analysis. The RNAs were first treated with DNase I to remove any contaminating DNA. 0.5µg RNA was mixed with 0.5µl DNase I (New England Biolabs) and 0.5µl DNase buffer, and the total volume was brought up to 5µl with DEPC-H2O. The reaction was first incubated at 25°C for 15 minutes and then, after addition of 0.5µl 25mM EDTA (Invitrogen), at 65°C for 10 minutes.

DNase-treated RNAs were transcribed into first strand cDNA using TaqMan® reverse transcription reagents (Applied Biosystems) according to the manufacturer's protocol. 5.5µl of DNase I-treated RNAs were added to 2.5µl of 10X RT buffer, 5.5µl of 25mM MgCl2, 5µl of dNTP (2.5mM each), 0.625µl of 50µM random hexamer, 0.625µl of 50µM oligo(dT), 0.5µl RNase inhibitor (20U/µl), 0.625µl of MultiScribe® Reverse Transcriptase (50U/µl) and 4.125µl of DEPC-H2O. The reaction was incubated at 25°C for 10 minutes, followed by 48°C for 45 minutes and 94°C for 5 minutes.

3.2.2.2 Primer design and verification

Gene expression differences were grouped into 3 categories according to the fold difference between the two stages: (1) expression in mycelium higher than in primordium, fold difference greater than 1.5; (2) expression in mycelium similar to primordium, fold difference between 0.9 and 1.1; (3) expression in primordium higher than in mycelium, fold difference smaller than 0.66. From each of the categories, four genes were selected for real-time PCR assays. Therefore, a total of twelve sets of primers were used (Table 3.2).
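The three categories can be expressed as a simple threshold rule on the fold difference (abundance in mycelium divided by abundance in primordium). The following Python sketch only illustrates the stated cut-offs; the function name and the single-letter labels are chosen here for convenience, and the treatment of values falling between the categories (returned as `None`) is an assumption.

```python
def classify_fold_difference(fold):
    """Assign a mycelium/primordium fold difference to one of the three categories."""
    if fold > 1.5:
        return "M"   # expression in mycelium higher than in primordium
    if 0.9 <= fold <= 1.1:
        return "S"   # expression in mycelium similar to primordium
    if fold < 0.66:
        return "P"   # expression in primordium higher than in mycelium
    return None      # falls between the categories; not selected


for fold in (2.91, 0.99, 0.25, 1.3):
    print(fold, classify_fold_difference(fold))
```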

Primers for real-time PCR analysis were designed using OLIGO™ 4.0 software (Molecular Biology Insights, Inc). The primers were selected such that they had similar melting temperatures and were hairpin-free, and the size of the amplicons was about 100-150bp. They were then verified by setting up a 20µl PCR reaction containing 2µl of 10X PCR buffer, 1.2µl of 25mM MgCl2, 1µl of dNTP (10mM each), 1µl each of 10µM designed upper primer and lower primer, 2µl of 10X diluted cDNA, 0.25µl of Taq polymerase (5U/µl) (Promega) and 11.55µl H2O. The reactions underwent a PCR program of 95°C for 3 minutes, followed by 50 cycles of 95°C for 15 seconds, 60°C for 45 seconds and 72°C for 15 seconds. PCR products were collected at the 30th and 50th cycles. 5µl of the products were analyzed on a 1.5% agarose gel for quantity and quality assessment.

Table 3.2 Primers for quantitative real-time PCR analysis. Copy numbers and abundances are from the 5' SAGE tags. Primer sequences are truncated in the source; only the legible 5' portions are shown.

Code(a)  Gene(b)                                                  Copy no.(c)     % Abundance(d)             Fold            Primers(f)
                                                                  Myc     Pri     Myc          Pri           difference(e)
M1       Thioredoxin                                              119     28      [illegible]  [illegible]   [illegible]     ATGGGCGTCA... / GAGATGATTC...
M2       Tetraspannin                                             2082    975     1.94         0.67          2.91            TGCTCGGCT... / TGTAGTTGGG...
M3       Vip1 protein                                             93      77      0.09         0.05          1.65            ACGGAGCATG... / AATTGCCCGAC...
M4       Subtilisin N-terminal region                             149     104     0.14         0.07          1.96            ACAACGTCAGC... / TGCAAACGCG...
S1       Clathrin adaptor complex small chain                     71      98      0.07         0.07          0.99            TCTTTTTCTGC... / GAATGGCATAC...
S2       ATP synthase oligomycin sensitivity conferral protein    82      117     0.08         0.08          0.96            CCATTAAGACC... / CAGAGAGGAG...
S3       60S ribosomal protein L34                                312     488     0.29         0.33          0.87            AAAGACCGTCC... / TGCTGCGACT...
S4       Ubiquitin fusion protein                                 146     221     0.14         0.15          0.9             CTCCACTTGG... / GGGCGTAGCA...
P1       40S ribosomal protein S14                                102     282     [illegible]  [illegible]   [illegible]     ATCACTGCCC... / ATACCAGCAC...
P2       Prohibitin PHB1                                          36      113     0.03         0.08          0.44            GGACTATGAC... / GCTGGCGGAT...
P3       ATP synthase delta chain                                 100     336     0.09         0.23          0.41            TTCGCCACCG... / TCAGCGAGGT...
P4       Basic leucine zipper and W2 domain 2                     28      154     0.03         0.11          0.25            CCTTCTTCCC... / CTTTCCTTGA...

a Codes represent the corresponding primer set. 'M' for expression in mycelium higher than in primordium; 'S' for expression in mycelium similar to primordium; 'P' for expression in primordium higher than in mycelium.

b Identification of the GLEAN model from BLASTx search of NCBI.

c Copy number of the corresponding GLEAN model of the gene as revealed from the 5' SAGE libraries.

d % Abundance of the GLEAN model is calculated by dividing the copy no. of the GLEAN model by total tag counts x 100%.

e Fold difference is calculated by dividing the abundance in mycelium by the abundance in primordium.

f Primers designed using OLIGO™ 4.0 software; the first is the forward primer and the second is the reverse primer.

3.2.2.3 Real time PCR reaction and data analysis

Real-time PCR was performed on a Bio-Rad MiniOpticon™ real-time system (Bio-Rad). 2µl of 10X diluted first strand cDNA was mixed with 0.45µl of each 10µM sequence-specific primer and 10µl of 2X iQ™ SYBR® Green Supermix (Bio-Rad), and brought up to 20µl with nuclease-free water according to the manufacturer's manual. The reaction for each primer set was performed in duplicate, together with a no-template control (NTC). The real-time PCR program was 95°C for 1 minute, followed by 40 cycles of 95°C for 15 seconds, 60°C for 45 seconds and 72°C for 15 seconds. Melting curve analysis was also performed by increasing the temperature from 50°C to 90°C. Data analysis was performed on Opticon Monitor™ Version 3.0 (Bio-Rad). The experiments were repeated at least three times with independent biological samples.

3.2.3 Gene expression level comparison

The total occurrence for each particular gene (GLEAN model) was obtained by summing up the occurrence of all the tags within 1000bp upstream of the start codon. In other words, the count of every tag found within 1000bp upstream of a gene was allocated to the expression level of that gene. The annotation for each gene was determined by BLASTX search, supplemented by BLASTP search. An e-value at an order of 10'^ was used as a criterion for confirmation of homology. Expression abundance of a gene in the respective stage was calculated by dividing the corresponding occurrence by the total tag occurrence of the respective stage.
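The tag-to-gene allocation and abundance calculation described above can be sketched as follows. This is an illustrative Python outline rather than the script used in this study: the function names, the tuple layout of the tag list and all coordinates are hypothetical, and tags are assumed to be assigned only to genes on the same strand.

```python
def upstream_tag_count(start_codon, strand, tags, window=1000):
    """Sum the counts of tags lying within `window` bp upstream of a start codon.

    `tags` is a list of (position, strand, count) tuples.
    """
    total = 0
    for pos, tag_strand, count in tags:
        if tag_strand != strand:
            continue
        # Distance upstream of the ATG, measured along the gene's own strand.
        dist = start_codon - pos if strand == '+' else pos - start_codon
        if 0 < dist <= window:
            total += count
    return total


def percent_abundance(gene_count, stage_total):
    """Expression abundance of a gene as a percentage of all tags in one stage."""
    return 100.0 * gene_count / stage_total


# Hypothetical tags around a plus-strand gene with its ATG at position 1000.
tags = [(900, '+', 5), (100, '+', 2), (999, '+', 1), (1000, '+', 7), (950, '-', 3)]
count = upstream_tag_count(1000, '+', tags)
print(count, percent_abundance(count, 1000))
```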

Differentially expressed genes were determined using the Fisher Exact Test. A two-tailed p-value below 0.05 was taken as confirmation of differential expression. The fold difference was calculated using the expression abundance values. For the sake of convenience, genes with an occurrence of 0 at any stage were assigned an occurrence of 0.5 so that the fold difference between the two stages could be calculated.
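As an illustration of this step, the Python sketch below implements a two-tailed Fisher exact test on a 2x2 table (tags of one gene versus all other tags, in mycelium versus primordium) together with the fold difference using the 0.5 substitution for zero counts. It is a minimal sketch under stated assumptions, not the software used in this study: the function names and the library totals of 50,000 tags in the example are hypothetical, and the two-tailed p-value is taken as the sum of the probabilities of all tables no more likely than the observed one.

```python
import math


def _log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def _log_hypergeom(x, row1, row2, col1):
    """Log probability of x in the top-left cell given fixed margins."""
    return (_log_comb(row1, x) + _log_comb(row2, col1 - x)
            - _log_comb(row1 + row2, col1))


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    lo, hi = max(0, col1 - row2), min(row1, col1)
    log_obs = _log_hypergeom(a, row1, row2, col1)
    p = 0.0
    for x in range(lo, hi + 1):
        lp = _log_hypergeom(x, row1, row2, col1)
        if lp <= log_obs + 1e-7:   # tables no more likely than the observed one
            p += math.exp(lp)
    return min(p, 1.0)


def fold_difference(myc_count, pri_count, myc_total, pri_total, pseudo=0.5):
    """Mycelium/primordium abundance ratio; zero counts are replaced by `pseudo`."""
    myc = myc_count if myc_count > 0 else pseudo
    pri = pri_count if pri_count > 0 else pseudo
    return (myc / myc_total) / (pri / pri_total)


# Hypothetical example: 28 vs 154 tags for one gene, 50,000 tags per library.
p_value = fisher_exact_two_tailed(28, 50000 - 28, 154, 50000 - 154)
print(p_value < 0.05, fold_difference(28, 154, 50000, 50000))
```

Working in log space with `math.lgamma` avoids computing very large binomial coefficients directly when the library totals reach tens of thousands of tags.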

The differentially expressed genes were also annotated with Gene Ontology (GO) terms. GO is a collaborative database used for describing and grouping gene products into three independent categories (ontologies): cellular components, molecular functions and biological processes. The annotations reflect the normal function, process and localization of a gene product, and the annotation of a gene product to one ontology is independent of its annotation to the other ontologies.

The complete set of GO annotations was downloaded from the GO website (http://www.geneontology.org/GO.downloads.shtml) and the freely available software Gene Ontology Browsing Utility (http://gobu.iis.sinica.edu.tw) was used to visualize the GO terms of the differentially expressed genes.

3.3 Results

3.3.1 Identification of housekeeping gene by Northern Blot analysis

Northern blotting was performed to identify a housekeeping gene across different developmental stages of C. cinerea. Total RNAs were extracted from 6 stages, including mycelium, stage 1 primordium, stage 2 primordium, stage 1 immature fruiting body, stage 2 immature fruiting body and mature fruiting body.

The RNAs were first fractionated on a formaldehyde denaturing gel and then transferred to a nylon membrane for Northern blotting using DIG-labeled cDNA probes. Equal loading of RNAs was monitored by similar intensities of the 28S and 18S rRNAs.

Three genes were tested, namely plasma membrane H+ ATPase (Cc.Pma), glucose-6-phosphate dehydrogenase (Cc.G6PDH) and Cc.Ras. Northern blotting for each of the three genes was repeated three times. None of the three genes was found to be constantly expressed during the life cycle of C. cinerea from mycelium to mature fruiting body. Even when only the mycelium and stage 1 primordium stages were compared, expression was not constant. For Cc.G6PDH, expression decreased gradually from mycelium to stage 2 primordium, then reached the highest level at stage 1 immature fruiting body and then decreased again (Figure 3.1). For Cc.Pma, the expression pattern was similar to that of Cc.G6PDH during the transition from mycelium to stage 2 primordium. Starting from stage 1 immature fruiting body, expression rose again and reached the maximum at mature fruiting body (Figure 3.2). For Cc.Ras, the overall expression pattern was quite similar to that of Cc.G6PDH, except that expression was roughly maintained from stage 1 primordium to stage 2 primordium (Figure 3.3).

[Northern blot image: Cc.G6PDH transcript (1.8kb) with 28S and 18S rRNA loading controls]

Expression ratio
Myc     S1 Pri   S2 Pri   S1 IFB   S2 IFB   MFB
1       0.31     0.23     3.05     0.96     0.09

Figure 3.1 Northern blotting of the transcripts of Cc.G6PDH at six different developmental stages of C. cinerea. Expression ratios are calculated from the relative expression level among different stages, by taking the level in mycelium as 1. The expression of Cc.G6PDH decreased gradually from mycelium to stage 2 primordium, then reached the highest level at stage 1 immature fruiting body and then decreased again.

Myc: mycelium; S1 Pri: stage 1 primordium; S2 Pri: stage 2 primordium; S1 IFB: stage 1 immature fruiting body; S2 IFB: stage 2 immature fruiting body; MFB: mature fruiting body.

[Northern blot image: Cc.Pma transcript]

Expression ratio
Myc     S1 Pri   S2 Pri   S1 IFB   S2 IFB   MFB
1       0.37     0.17     1.34     2.04     3.61

Figure 3.2 Northern blotting of the transcripts of Cc.Pma at six different developmental stages of C. cinerea. Expression ratios are calculated from the relative expression level among different stages, by taking the level in mycelium as 1. The expression of Cc.Pma decreased gradually from mycelium to stage 2 primordium, then started to rise again in stage 1 immature fruiting body and reached the maximum in mature fruiting body.

Myc: mycelium; S1 Pri: stage 1 primordium; S2 Pri: stage 2 primordium; S1 IFB: stage 1 immature fruiting body; S2 IFB: stage 2 immature fruiting body; MFB: mature fruiting body.

[Northern blot image: Cc.Ras transcript with rRNA loading control]

Expression ratio
Myc     S1 Pri   S2 Pri   S1 IFB   S2 IFB   MFB
1       0.45     0.37     1.33     0.78     0.14

Figure 3.3 Northern blotting of the transcripts of Cc.Ras at six different developmental stages of C. cinerea. Expression ratios are calculated from the relative expression level among different stages, by taking the level in mycelium as 1. The expression of Cc.Ras decreased from mycelium to stage 1 primordium and was roughly maintained in stage 2 primordium, then reached the highest level at stage 1 immature fruiting body and then decreased again.

Myc: mycelium; S1 Pri: stage 1 primordium; S2 Pri: stage 2 primordium; S1 IFB: stage 1 immature fruiting body; S2 IFB: stage 2 immature fruiting body; MFB: mature fruiting body.

3.3.2 Quantitative real-time PCR analysis

Quantitative real-time PCR assays were used to assess the reliability and accuracy of the 5' SAGE libraries. From each of the three categories stated in section 3.2.2.2, four genes were selected and thus a total of twelve genes were tested.

Before real-time PCR, the primers were first verified by a PCR reaction with settings similar to those of real-time PCR, except that no SYBR® Green was added. PCR products were collected at the 30th and 50th cycles, and gel electrophoresis revealed the presence of a single band and a greater amount of product at the 50th cycle (Figure 3.5). Absence of non-specific amplification and primer-dimers during real-time PCR was confirmed by melting curve analysis on the Bio-Rad MiniOpticon™ real-time system. For all the final products, only a single peak was observed in the plot of the negative first derivative of fluorescence intensity against temperature, and all the peaks occurred above 70°C. These indicated the absence of contaminants in the real-time PCR products (Figure 3.6).

The real-time PCR assays compared the expression of 12 genes between mycelium and stage 1 primordium. These genes were suggested by the 5' SAGE data to be either differentially expressed or similarly expressed.

The expression of the clathrin coat assembly protein gene was found to be constant between the mycelial and primordial stages in three different biological samples (Table 3.3), so Northern blotting was performed to confirm its constant expression between these two stages. The results showed that the gene was indeed constantly expressed (Figure 3.4). As no other housekeeping gene had been found, the threshold cycle (Ct) for each gene was normalized against the Ct of the clathrin coat assembly protein gene. For the categories 'Expression in mycelium higher than in primordium' and 'Expression in mycelium similar to primordium', the expression levels in primordium were taken to be 1, while for the category 'Expression in primordium higher than in mycelium', the expression levels in mycelium were taken to be 1.
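The text does not state the formula used to convert the normalized Ct values into relative expression levels. A common choice is the 2^(-ΔΔCt) calculation, which assumes an amplification efficiency close to 100% for both the target and the reference gene. The Python sketch below assumes that method; the function name and the target-gene Ct values are hypothetical, while the reference Ct values are those of sample 1 in Table 3.3.

```python
def relative_expression(ct_target, ct_ref, ct_target_cal, ct_ref_cal):
    """Relative expression by the 2^(-ddCt) method (assumed, not stated in the text).

    ct_target, ct_ref         : Ct of target and reference gene in the sample
    ct_target_cal, ct_ref_cal : Ct of target and reference gene in the calibrator,
                                i.e. the stage whose expression level is taken as 1
    """
    delta_sample = ct_target - ct_ref
    delta_cal = ct_target_cal - ct_ref_cal
    return 2.0 ** -(delta_sample - delta_cal)


# Reference (clathrin coat assembly protein) Ct values from Table 3.3, sample 1;
# the target Ct values (20.15 and 21.18) are hypothetical.
ratio_myc = relative_expression(20.15, 22.15, 21.18, 22.18)  # primordium taken as 1
print(round(ratio_myc, 2))
```

With the calibrator compared against itself the function returns 1, which corresponds to the stage whose expression level is "taken to be 1".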

Table 3.3 Threshold cycle detected from real-time PCR for the clathrin coat assembly protein gene.

                     Sample 1   Sample 2   Sample 3
Mycelium             22.15      25.13      22.51
Stage 1 primordium   22.18      25.16      22.46

[Northern blot image: clathrin coat assembly protein transcript with 28S and 18S rRNA loading controls]

Expression ratio
Myc     S1 Pri
1       0.94

Figure 3.4 Northern blotting of the transcripts encoded by the clathrin coat assembly protein gene at mycelial and primordial stages of C. cinerea. Expression ratios are calculated from the relative expression level between the two stages, by taking the level in mycelium as 1. The expression level was similar between mycelium and stage 1 primordium.

Myc: mycelium; S1 Pri: stage 1 primordium

Figure 3.5 Verification of 12 sets of real-time PCR primers by 1.5% agarose gel electrophoresis. (a) PCR products collected at the 30th PCR cycle. Lane 1: 100bp ladder; lanes 2-5: primer sets M1-M4; lanes 6-9: primer sets S1-S4; lanes 10-13: primer sets P1-P4. (b) PCR products collected at the 50th PCR cycle. Lane 1: 100bp ladder; lanes 2-5: primer sets M1-M4; lanes 6-9: primer sets S1-S4; lanes 10-13: primer sets P1-P4.

Figure 3.6 Melting curve analysis for 12 real-time PCR products. (a) Melting curve for products amplified from primer sets M1-M4. (b) Melting curve for products amplified from primer sets S1-S4. (c) Melting curve for products amplified from primer sets P1-P4. A single peak was observed for all products amplified by the primer sets.

[Bar charts, panels (a)-(d): relative expression level from real-time PCR at each developmental stage (Myc, S1 Pri)]

5' SAGE results
Panel   Gene                            Occurrence        % Abundance
                                        Myc     S1 Pri    Myc     S1 Pri
(a)     Thioredoxin                     119     28        0.24    0.05
(b)     Tetraspannin                    2082    975       4.19    1.57
(c)     Vip1 protein                    93      77        0.19    0.12
(d)     Subtilisin N-terminal region    149     104       0.30    0.17

Figure 3.7 The relative expression level ratio of (a) thioredoxin, (b) tetraspannin, (c) Vip1 protein, (d) subtilisin N-terminal region, from real-time PCR analysis. The relative expression level was calculated by taking the expression level in stage 1 primordium as 1. Results were generated from three independent RNA samples and each PCR reaction was in duplicate.

Myc: mycelium; S1 Pri: stage 1 primordium.

[Figure 3.8: bar charts of relative expression level ratio against developmental stage (Myc, S1 Pri) for panels (a) to (c), each shown with the corresponding 5' SAGE results.]

5' SAGE results shown with Figure 3.8:
Panel | Gene | Occurrence (Myc) | Occurrence (S1 Pri) | % abundance (Myc) | % abundance (S1 Pri)
(a) | ATP synthase oligomycin sensitivity conferral protein | 82 | 117 | 0.16 | 0.19
(b) | 60S ribosomal protein L34 | 312 | 488 | 0.63 | 0.78
(c) | Ubiquitin fusion protein | 146 | 221 | 0.29 | 0.36

Figure 3.8 The relative expression level ratio of (a) ATP synthase oligomycin sensitivity conferral protein, (b) 60S ribosomal protein L34 and (c) ubiquitin fusion protein from real-time PCR analysis. The relative expression level was calculated by taking the expression level in stage 1 primordium as 1. Results were generated from three independent RNA samples and each PCR reaction was in duplicate. Myc: mycelium; S1 Pri: stage 1 primordium.

[Figure 3.9: bar charts of relative expression level ratio against developmental stage (Myc, S1 Pri) for panels (a) to (d), each shown with the corresponding 5' SAGE results.]

5' SAGE results shown with Figure 3.9:
Panel | Gene | Occurrence (Myc) | Occurrence (S1 Pri) | % abundance (Myc) | % abundance (S1 Pri)
(a) | 40S ribosomal protein S14 | 102 | 282 | 0.21 | 0.46
(b) | Prohibitin PHB1 | 36 | 113 | 0.07 | 0.18
(c) | ATP synthase delta chain | 100 | 336 | 0.20 | 0.54
(d) | Basic leucine zipper and W2 domain 2 | 28 | 154 | 0.06 | 0.25

Figure 3.9 The relative expression level ratio of (a) 40S ribosomal protein S14, (b) prohibitin PHB1, (c) ATP synthase delta chain and (d) basic leucine zipper and W2 domain 2 from real-time PCR analysis. The relative expression level was calculated by taking the expression level in mycelium as 1. Results were generated from three independent RNA samples and each PCR reaction was in duplicate. Myc: mycelium; S1 Pri: stage 1 primordium.

3.3.3 Gene expression level comparison

The expression level of each gene was determined by summing the occurrences of all tags found within 1,000 bp upstream of the gene. In this way, the 29,858 unique mycelial tags were assigned to 4,697 distinct genes and the 40,083 primordial tags were assigned to 5,594 distinct genes. Pooled together, the tags identified 6,736 distinct genes. The percentage abundance of each gene was calculated by dividing its total occurrence by the total count of tags mapped to the putative 5'-UTR (49,718 for mycelium and 61,964 for primordium). The 150 most highly expressed genes in the two stages are summarized in Table 3.3 and 3.4.
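The calculation described above can be sketched in Python. This is only an illustration under stated assumptions, not the pipeline used in this study: the tuple layouts for tags and genes, the function names `assign_tags` and `percent_abundance`, and the rule that a tag is credited to the nearest gene start on the same strand are hypothetical. Only the 1,000 bp window and the two tag totals are taken from the text.

```python
from collections import defaultdict

UPSTREAM_WINDOW = 1000  # bp upstream of the gene start, as stated in the text


def assign_tags(tags, genes, window=UPSTREAM_WINDOW):
    """Sum tag counts per gene.

    tags:  iterable of (contig, position, strand, count)   (hypothetical layout)
    genes: iterable of (gene_id, contig, start, strand)    (hypothetical layout)
    A tag is credited to the nearest gene whose start lies within `window` bp
    downstream of the tag on the same strand (an assumed assignment rule).
    """
    occurrence = defaultdict(int)
    for contig, pos, strand, count in tags:
        best = None
        for gene_id, g_contig, start, g_strand in genes:
            if g_contig != contig or g_strand != strand:
                continue
            # Distance from tag to gene start in the direction of transcription.
            dist = start - pos if strand == "+" else pos - start
            if 0 < dist <= window and (best is None or dist < best[0]):
                best = (dist, gene_id)
        if best is not None:
            occurrence[best[1]] += count
    return dict(occurrence)


def percent_abundance(occurrence, total_mapped_tags):
    """Occurrence of a gene as a percentage of all tags mapped to putative 5'-UTRs."""
    return 100.0 * occurrence / total_mapped_tags


# Toy data (invented for illustration only).
example_genes = [("g1", "c1", 1000, "+"), ("g2", "c1", 2000, "-")]
example_tags = [
    ("c1", 950, "+", 10),   # 50 bp upstream of g1
    ("c1", 400, "+", 5),    # 600 bp upstream of g1
    ("c1", 2100, "-", 7),   # 100 bp upstream of g2 (minus strand)
    ("c1", 950, "-", 3),    # wrong side of g2, not assigned
    ("c1", 5000, "+", 2),   # no gene within the window, not assigned
]
example_occurrence = assign_tags(example_tags, example_genes)
```

With the totals given in the text, `percent_abundance(3747, 49718)` rounds to 7.54 and `percent_abundance(1381, 61964)` rounds to 2.23, matching the first rows of the two top-150 tables.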

On the whole, the top 150 expressed genes accounted for 66% of the transcriptome of the mycelium but only 50% of that of the primordium. Among the most highly expressed mycelial genes, 3 copies of a gene encoding a mismatched base pair and cruciform DNA recognition protein were among the top five. Together with two uncharacterized genes, these 5 genes comprised almost 25% of the transcriptome in the mycelial stage. In contrast, in the primordial stage the top 30 genes are required to reach the same percentage. As reported in previous literature (Asgeirsdottir et al., 1998; Ng et al., 2000), the hydrophobins are highly expressed in both the mycelium and the primordium. Three genes encoding hydrophobin CoH1 were found to be highly expressed in the mycelium, and three hydrophobin-coding genes were also identified in the primordium. Interestingly, the three CoH1 genes and two of the hydrophobin genes in the primordium were each found to be clustered in adjacent loci.
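The "almost 25%" figure for the five most highly expressed mycelial genes can be checked with a short Python sketch. The occurrences are those listed for ranks 1 to 5 in the mycelial top-150 table and the total is the 49,718 mapped mycelial tags. The helper names `cumulative_share` and `genes_needed` are hypothetical and introduced only for this illustration.

```python
MYCELIUM_TOTAL = 49718  # tags mapped to putative 5'-UTRs in mycelium (from the text)

# Occurrences of the five highest-ranked mycelial gene models (from the table).
top5_mycelium = [3747, 3486, 2082, 1951, 1102]


def cumulative_share(occurrences, total_mapped_tags):
    """Percentage of the transcriptome accounted for by the given genes."""
    return 100.0 * sum(occurrences) / total_mapped_tags


def genes_needed(occurrences, total_mapped_tags, target_percent):
    """Number of top-ranked genes needed to reach target_percent, or None if never reached."""
    running = 0
    for rank, occ in enumerate(sorted(occurrences, reverse=True), start=1):
        running += occ
        if 100.0 * running / total_mapped_tags >= target_percent:
            return rank
    return None


share_top5 = cumulative_share(top5_mycelium, MYCELIUM_TOTAL)  # about 24.9
```

The same `genes_needed` helper could be applied to the full primordial list to count how many genes are required to reach a comparable share.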

For the primordial stage, the enzyme ribitol kinase had the highest expression level, comprising ~1% of the transcriptome. The ribosomal proteins were highly expressed as well: among the 120 genes (out of 150) for which a homolog had been identified, 44 corresponded to ribosomal proteins and comprised ~7.5% of the transcriptome. The Cgl3 lectin is also noteworthy. Compared with the two well-characterized galectins Cgl1 and Cgl2 (Cooper et al., 1997; Boulianne et al., 2000), Cgl3 has received little attention, yet it is highly expressed in the primordium.

Table 3.4 Summary of the top 150 most highly expressed genes in the mycelial stage.

No. | GLEAN model^a | Occurrence^b | % abundance^c | Accession ID | Annotation^d | Organism
1 | 3070 | 3747 | 7.54
2 | 779 | 3486 | 7.01 | CAB85690 | Mismatched base pair and cruciform DNA recognition protein | Agaricus
3 | 9275 | 2082 | 4.19
4 | 9765 | 1951 | 3.92 | XP_569409 | Hmp1 protein | Cryptoco
5 | 767 | 1102 | 2.22 | CAB85690 | Mismatched base pair and cruciform DNA recognition protein | Agaricus
6 | 9938 | 1029 | 2.07 | CAG29170 | Copper transporter | Pleurotu
7 | 6650 | 617 | 1.24
8 | 3725 | 589 | 1.18
9 | 8487 | 587 | 1.18 | XP_001274747 | Ribitol kinase | Magnapo
10 | 7677 | 537 | 1.08 | CAA71652 | CoH1 | Coprinop
11 | 10060 | 529 | 1.06
12 | 10036 | 506 | 1.02
13 | 833 | 482 | 0.97 | XP_001261781 | Hemerythrin HHE cation binding domain protein | Neosarto
14 | 7676 | 477 | 0.96 | CAA71652 | CoH1 | Coprinop
15 | 5468 | 424 | 0.85 | XP_568367 | Rab GTPase activator | Cryptoco
16 | 11350 | 415 | 0.83
17 | 2814 | 383 | 0.77
18 | 8584 | 364 | 0.73 | XP_569612 | 40S ribosomal protein S12 | Cryptoco
19 | 6946 | 344 | 0.69
20 | 8468 | 343 | 0.69
21 | 868 | 312 | 0.63 | XP_567178 | 60s ribosomal protein l34-b | Cryptoco
22 | 7754 | 312 | 0.63 | CAD10794 | Putative ribosomal protein S19 | Pleurotu
23 | 11577 | 312 | 0.63
24 | 764 | [illegible] | [illegible]
25 | 898 | 301 | 0.61 | XP_956090 | Conidiation-specific protein 6 | Neurospo
26 | 1366 | 254 | 0.51 | NP_984607 | AEL254Wp | Ashbya g
27 | 7036 | 239 | 0.48
28 | 733 | 237 | 0.48 | XP_571980 | 60S ribosomal protein L24 (L30) | Cryptoco
29 | 7037 | 234 | 0.47
30 | 602 | 233 | 0.47 | NP_740781 | 60s ribosomal protein l17 | Cryptoco
31 | 6473 | 233 | 0.47
32 | 12499 | 219 | 0.44
33 | 7947 | 210 | 0.42 | XP_569827 | Structural constituent of ribosome | Cryptoco
34 | 11089 | 200 | 0.40
35 | 8241 | 194 | 0.39
36 | 10885 | 187 | 0.38 | EAU91281 | 40S ribosomal protein S18 | Coprinop
37 | 1791 | 180 | 0.36 | AAY85811 | 40S ribosomal protein S27 | Chaetom
38 | 7074 | 179 | 0.36 | XP_569366 | PRCDNA38 | Cryptoco
39 | 7898 | 171 | 0.34
40 | 12520 | 167 | 0.34
41 | 1750 | 163 | 0.33 | XP_572717 | 60S ribosomal protein L19 | Cryptoco
42 | 10955 | 160 | 0.32 | CAD10795 | Putative phosphatidic acid phosphatase | Pleurotu
43 | 1500 | 151 | 0.30
44 | 10462 | 149 | 0.30 | 1ITP_A | Chain A, Solution Structure Of Poia1 | Pleurotu
45 | 2008 | 146 | 0.29 | XP_751209 | RNA binding protein, putative | Aspergil
46 | 11833 | 146 | 0.29 | EAU92224 | Ubiquitin | Coprinop
47 | 2467 | 137 | 0.28 | NP_986927 | PREDICTED: similar to phosphatidylethanolamine-binding protein | Nasonia
48 | 6344 | 137 | 0.28 | NP_986927 | AGR261Wp | Ashbya g
49 | 63 | 134 | 0.27 | EAU83175 | Serine/threonine-protein phosphatase PP2A-2 catalytic subunit | Coprinop
50 | 9523 | 133 | 0.27 | XP_568154 | Gal4 DNA-binding enhancer protein 2 | Cryptoco
51 | 7246 | 131 | 0.26 | XP_001248452 | 40S ribosomal protein S15 | Coccidio
52 | 8486 | 131 | 0.26
53 | 2775 | 129 | 0.26 | XP_572112 | 40s ribosomal protein s23 | Cryptoco
54 | 8189 | 121 | 0.24
55 | 1507 | 119 | 0.24 | XP_001261730 | Thioredoxin, putative | Neosarto
56 | 2622 | 112 | 0.23
57 | 11858 | 112 | 0.23 | EAU93093 | 40S ribosomal protein S11 | Coprinop
58 | 12181 | 110 | 0.22 | XP_571929 | Cytochrome-c oxidase | Cryptoco
59 | 7350 | 103 | 0.21 | AAC69196 | 40S ribosomal protein S8 | Schizoph
60 | 9822 | 102 | 0.21 | EAU86619 | 40S ribosomal protein S14 | Coprinop
61 | 2031 | 101 | 0.20 | XP_572711 | 40S ribosomal protein S7 | Cryptoco
62 | 863 | 100 | 0.20
63 | 6814 | 100 | 0.20 | Q92196 | ATP synthase delta chain, mitochondrial precursor | Agaricus
64 | 6947 | 100 | 0.20
65 | 5205 | 99 | 0.20 | XP_001264326 | 88 kDa immunoreactive mannoprotein MP88 | Cryptoco
66 | 7323 | 99 | 0.20 | XP_567104 | Ribosomal protein S2 | Cryptoco
67 | 9958 | 99 | 0.20 | XP_571691 | Fasciclin domain family | Neosarto
68 | 9312 | 98 | 0.20 | XP_571982 | Glycine-rich RNA binding protein | Cryptoco
69 | 5354 | 97 | 0.20
70 | 3562 | 96 | 0.19 | XP_001212923 | 60S ribosomal protein L20 | Aspergil
71 | 9297 | 96 | 0.19 | XP_001356719 | Guanine nucleotide binding protein beta subunit | Lentinul
72 | 5628 | 93 | 0.19 | XP_001275305 | Actin cytoskeleton protein (VIP1), putative | Aspergil
73 | 2575 | 88 | 0.18 | EAU91281 | Structural constituent of ribosome | Cryptoco
74 | 5233 | 88 | 0.18
75 | 10888 | 88 | 0.18 | XP_571207 | 40S ribosomal protein S18 | Coprinop
76 | 12124 | 83 | 0.17 | XP_567416 | 60s ribosomal protein l11 | Cryptoco
77 | 1128 | 82 | 0.16 | 093931 | ATP synthase oligomycin sensitivity conferral protein, putative | Aspergil
78 | 10794 | [illegible] | 0.16 | XP_956517 | 40S ribosomal protein S26 | Schizoph
79 | 6108 | 81 | 0.16 | XP_569713 | Hydrogen-transporting ATP synthase | Cryptoco
80 | 838 | 79 | 0.16 | XP_566913 | PRCDNA87 | Cryptoco
81 | 2177 | 79 | 0.16 | XP_566824 | Defender against cell death 1 (dad-1) | Cryptoco
82 | 11344 | 77 | 0.15
83 | 2369 | 75 | 0.15 | XP_001259479 | Ran-specific GTPase-activating protein 1, putative | Neosarto
84 | 4001 | 75 | 0.15
85 | 11825 | 73 | 0.15
86 | 181 | 71 | 0.14 | BAC78820 | Ubiquitin-conjugating enzyme 9 | Coprinop
87 | 3385 | 71 | 0.14 | AAP20199 | 60S ribosomal protein L44 | Coprinop
88 | 5551 | 71 | 0.14 | XP_571764 | L10e protein | Cryptoco
89 | 6095 | 71 | 0.14 | EAU81446 | Vesicle-mediated transport-related protein | Cryptoco
90 | 6827 | 71 | 0.14 | XP_568690 | 40S ribosomal protein S5 | Pagrus
91 | 2002 | 70 | 0.14 | NP_001049239 | Os03g0192400 | Oryza sa
92 | 5084 | 70 | 0.14 | P62792 | Histone H4 | Phanero
93 | 8431 | 69 | 0.14 | XP_755660 | Alcohol dehydrogenase, putative | Aspergil
94 | 65 | 68 | 0.14 | XP_001217810 | Alcohol oxidase | Aspergil
95 | 888 | 68 | 0.14
96 | 3305 | 66 | 0.13 | ZP_01112629 | Protein containing QXW lectin repeats | Reinekea
97 | 10726 | 65 | 0.13
98 | 8255 | 62 | 0.12 | XP_001267852 | Mitochondrial ATP synthase epsilon chain domain-containing protein | Aspergil
99 | 1787 | 61 | 0.12
100 | 6714 | 60 | 0.12
101 | 7750 | 60 | 0.12 | Q9UVF8 | Thiazole biosynthetic enzyme, mitochondrial precursor | Uromyce
102 | 750 | 57 | 0.11 | XP_569162 | Ribosomal L10 protein | Cryptoco
103 | 3932 | 57 | 0.11 | ZP_01352093 | Glycosyl hydrolase, family 88 | Clostridi
104 | 7518 | 57 | 0.11 | ABB72849 | Eukaryotic ADP/ATP carrier | Cryptoco
105 | 9300 | [illegible] | 0.11 | NP_033102 | Ribosomal protein L12 | Mus mus
106 | 1327 | 55 | 0.11 | ABD64675 | CGL3 lectin | Coprinop
107 | 7675 | 55 | 0.11 | CAA74987 | CoH1 | Coprinop
108 | 10509 | 55 | 0.11 | XP_001384926 | ATP synthase F0 sector subunit 4 | Pichia st
109 | 307 | 53 | 0.11 | XP_319810 | Glia maturation factor beta | Cyprinu
110 | 3431 | 52 | 0.10 | XP_568864 | 40S ribosomal protein S17 | Aspergil
111 | 7513 | 52 | 0.10 | XP_663583 | PRCDNA95 | Cryptoco
112 | 995 | 51 | 0.10
113 | 9570 | 51 | 0.10 | AAX51843 | Glutamate decarboxylase | Paxillus
114 | 10445 | 51 | 0.10 | AAQ16628 | Manganese superoxide dismutase | Taiwano
115 | 1416 | 50 | 0.10 | AAP36848 | Tubulin-specific chaperone a | Homo sa
116 | 9976 | 50 | 0.10 | EDN63827 | Conserved protein | Sacchar
117 | 3814 | 49 | 0.10 | YP_412388 | Phosphoesterase, PA-phosphatase related | Nitrosos
118 | 3858 | 49 | 0.10 | XP_001270808 | Glutathione S-transferase, putative | Aspergil
119 | 4141 | 49 | 0.10
120 | 9024 | 49 | 0.10
121 | 10029 | 49 | 0.10 | XP_567959 | GTP-binding protein Ypt1 | Ustilago
122 | 10749 | 49 | 0.10 | ABM55612 | Bud site selection-related protein | Cryptoco
123 | 1528 | 48 | 0.10 | AAD43253 | Glutaredoxin-C8 precursor Glutaredoxin-C4 homolog | Gracilar
124 | 6494 | 48 | 0.10
125 | 226 | 47 | 0.09 | XP_572088 | 12 kDa heat shock protein (glucose and lipid-regulated protein) | Cryptoco
126 | 1979 | 47 | 0.09 | XP_001270561 | LYR family protein | Aspergil
127 | 4879 | 46 | 0.09
128 | 5285 | 46 | 0.09 | EAU89607 | PREDICTED: similar to PASG | Nasonia
129 | 7783 | 46 | 0.09
130 | 8322 | 46 | 0.09 | EAU89607 | 60S ribosomal protein L27a | Coprinop
131 | 9891 | 46 | 0.09 | XP_570752 | Mitochondrial fission-related protein | Cryptoco
132 | 416 | 45 | 0.09 | XP_570275 | Oxidoreductase | Cryptoco
133 | 524 | 45 | 0.09 | EAU92889 | RAB GDP-dissociation inhibitor | Cryptoco
134 | 6375 | 45 | 0.09 | XP_571732 | Ubiquitin | Coprinop
135 | 1116 | 44 | 0.09
136 | 2570 | 44 | 0.09 | ABE89643 | Short-chain dehydrogenase | Aspergil
137 | 9292 | 44 | 0.09 | ZP_00345738 | COG2814: Arabinose efflux permease | Nostoc p
138 | 9766 | 44 | 0.09 | XP_001273658 | Aldehyde dehydrogenase | Medicag
139 | 2513 | 43 | 0.09 | ZP_01521948 | ThiJ/PfpI | Comamo
140 | 3224 | 43 | 0.09 | XP_001387403 | 60S large subunit ribosomal protein L2A | Pichia st
141 | 7751 | 43 | 0.09 | NP_080276 | Proteasome 26S non-ATPase subunit 9 | Mus mus
142 | 8240 | 43 | 0.09
143 | 8507 | 42 | 0.08
144 | 10627 | 42 | 0.08
145 | 12063 | 42 | 0.08 | XP_568483 | (R,R)-butanediol dehydrogenase | Cryptoco
146 | 8708 | 41 | 0.08
147 | 229 | 40 | 0.08 | XP_746575 | 60S ribosomal protein L7 | Aspergil
148 | 341 | 40 | 0.08 | XP_001269398 | Ketoreductase, putative | Aspergil
149 | 2126 | 40 | 0.08 | EAU84422 | DNA damage checkpoint protein rad24 | Coprinop
150 | 9795 | 40 | 0.08 | XP_001261781 | Hemerythrin HHE cation binding domain protein | Neosarto

a The GLEAN model number takes the form Gene:Jan06m300_GLEAN_XXXXX when querying the C. cinerea genome annotations.
b Occurrence of the gene model is the sum of the counts of tags found within 1 kb upstream.
c % abundance is calculated by dividing the occurrence of the gene by 49,718 (the total count of tags mapped to the putative 5'-UTR in the mycelium).
d The annotation is left blank if the gene had no homologous proteins or corresponded to an unknown protein.
e An e-value of 1.00E-5 is used as confirmation of protein homology.

Table 3.5 Summary of the top 150 most highly expressed genes in the primordial stage.

No. | GLEAN model^a | Occurrence^b | % abundance^c | Accession ID | Annotation^d | Organism
1 | 8487 | 1381 | 2.23 | XP_362846 | Ribitol kinase | Magnapo
2 | 9275 | 975 | 1.57
3 | 10060 | 890 | 1.44
4 | 10036 | 865 | 1.40
5 | 733 | 805 | 1.30 | XP_571980 | 60S ribosomal protein L24 (L30) | Cryptoco
6 | 5468 | 766 | 1.24 | XP_568367 | Rab GTPase activator | Cryptoco
7 | 7754 | 764 | 1.23 | CAD10794 | Putative ribosomal protein S19 | Pleurotu
8 | 8584 | 657 | 1.06 | XP_569612 | 40S ribosomal protein S12 | Cryptoco
9 | 5074 | 566 | 0.91
10 | 1366 | 561 | 0.91 | XP_570887 | 60s ribosomal protein l21-a | Cryptoco
11 | 868 | 488 | 0.79 | XP_567178 | 60s ribosomal protein l34-b | Cryptoco
12 | 10846 | 476 | 0.77
13 | 2775 | 471 | 0.76 | XP_572112 | 40s ribosomal protein s23 | Cryptoco
14 | 7947 | 440 | 0.71 | XP_569827 | Structural constituent of ribosome | Cryptoco
15 | 602 | 434 | 0.70 | XP_568654 | 60s ribosomal protein l17 | Cryptoco
16 | 1327 | 374 | 0.60 | ABD64675 | CGL3 lectin | Coprinop
17 | 10885 | 372 | 0.60 | EAU91281 | 40S ribosomal protein S18 | Coprinop
18 | 6344 | 360 | 0.58 | XP_001561242 | 40S ribosomal protein S28 | Botryotin
19 | 4267 | 349 | 0.56
20 | 7074 | 342 | 0.55 | XP_569366 | PRCDNA38 | Cryptoco
21 | 7350 | 336 | 0.54 | AAC69196 | 40S ribosomal protein S8 | Schizoph
22 | 6814 | 336 | 0.54 | Q92196 | ATP synthase delta chain, mitochondrial precursor | Agaricus
23 | 1791 | 330 | 0.53 | AAY85811 | 40S ribosomal protein S27 | Chaetom
24 | 1750 | [illegible] | [illegible] | XP_572717 | 60S ribosomal protein L19 | Cryptoco
25 | 2467 | 313 | 0.51 | XP_001607900 | PREDICTED: similar to phosphatidylethanolamine-binding protein | Nasonia
26 | 7246 | 310 | 0.50 | XP_001248452 | 40S ribosomal protein S15 | Coccidio
27 | 11858 | 303 | 0.49 | EAU93093 | 40S ribosomal protein S11 | Coprinop
28 | 3305 | 302 | 0.49 | ZP_01112629 | Protein containing QXW lectin repeats | Reinekea
29 | 3070 | 293 | 0.47
30 | 2622 | 292 | 0.47
31 | 3623 | 291 | 0.47
32 | 9822 | 282 | 0.46 | EAU86619 | 40S ribosomal protein S14 | Coprinop
33 | 9523 | 273 | 0.44 | XP_568154 | Gal4 DNA-binding enhancer protein 2 | Cryptoco
34 | 5233 | 256 | 0.41
35 | 3562 | 253 | 0.41 | XP_001212923 | 60S ribosomal protein L20 | Aspergil
36 | 838 | 249 | 0.40 | XP_566913 | PRCDNA87 | Cryptoco
37 | 764 | 248 | 0.40
38 | 9297 | 245 | 0.40 | AAP13580 | Guanine nucleotide binding protein beta subunit | Lentinula
39 | 5528 | 238 | 0.38 | AAW48295 | Pore-forming toxin-like protein Hfr-2 | Triticum
40 | 7518 | 237 | 0.38 | ABB72849 | Eukaryotic ADP/ATP carrier | Cryptoco
41 | 7980 | 226 | 0.36
42 | 11011 | 223 | 0.36 | BAB84546 | Hydrophobin-263 | Pholiota
43 | 11833 | 221 | 0.36 | EAU92224 | Ubiquitin | Coprinop
44 | 181 | 211 | 0.34 | BAC78820 | Ubiquitin-conjugating enzyme 9 | Coprinop
45 | 2575 | 210 | 0.34 | XP_571207 | Structural constituent of ribosome | Cryptoco
46 | 2031 | 207 | 0.33 | XP_572711 | 40S ribosomal protein S7 | Cryptoco
47 | 4957 | 196 | 0.32 | XP_570222 | 40s ribosomal protein s6-b | Cryptoco
48 | 10794 | 195 | 0.31 | 093931 | 40S ribosomal protein S26 | Schizoph
49 | 12124 | 190 | 0.31 | XP_567416 | 60s ribosomal protein l11 | Cryptoco
50 | 6827 | 188 | 0.30 | AAP20199 | 40S ribosomal protein S5 | Pagrus
51 | 2002 | [illegible] | 0.30 | NP_001049239 | Os03g0192400 | Oryza sa
52 | 9312 | 183 | 0.30 | XP_571982 | Glycine-rich RNA binding protein | Cryptoco
53 | 7323 | 176 | 0.28 | XP_571691 | Ribosomal protein S2 | Cryptoco
54 | 779 | 175 | 0.28 | CAB85690 | Mismatched base pair and cruciform DNA recognition protein | Agaricus
55 | 7923 | 167 | 0.27 | XP_570111 | Acyl carrier protein (acp) | Cryptoco
56 | 5551 | 159 | 0.26 | XP_571764 | L10e protein | Cryptoco
57 | 12034 | 156 | 0.25 | XP_568023 | Methylenetetrahydrofolate reductase (NADPH) | Cryptoco
58 | 7309 | 154 | 0.25 | NP_956002 | Basic leucine zipper and W2 domains 1 | Danio re
59 | 3385 | 151 | 0.24 | EAU81446 | 60S ribosomal protein L44 | Coprinop
60 | 6108 | 150 | 0.24 | XP_569713 | Hydrogen-transporting ATP synthase | Cryptoco
61 | 6714 | 146 | 0.24
62 | 65 | 144 | 0.23 | XP_001217810 | Alcohol oxidase | Aspergil
63 | 7469 | 144 | 0.23 | XP_220903 | PREDICTED: similar to calcium binding and coiled-coil domain 2 | Rattus no
64 | 7513 | 142 | 0.23 | XP_568864 | PRCDNA95 | Cryptoco
65 | 10509 | 140 | 0.23 | XP_001384926 | ATP synthase F0 sector subunit 4 | Pichia st
66 | 9938 | 139 | 0.22 | CAG29170 | Copper transporter | Pleurotus
67 | 3431 | 139 | 0.22 | XP_663583 | 40S ribosomal protein S17 | Aspergil
68 | 1416 | 138 | 0.22 | AAP36848 | Tubulin-specific chaperone a | Homo sa
69 | 8731 | 138 | 0.22 | AAL05426 | Hydrophobin | Tricholo
70 | 12181 | 136 | 0.22 | XP_571929 | Cytochrome-c oxidase | Cryptoco
71 | 11852 | 136 | 0.22 | XP_001266325 | NADH-ubiquinone oxidoreductase | Neosarto
72 | 6412 | 136 | 0.22 | XP_001259287 | Hsp90 binding co-chaperone (Sba1), putative | Neosarto
73 | 10837 | 136 | 0.22
74 | 1979 | 135 | 0.22 | XP_001270561 | LYR family protein | Aspergil
75 | 5232 | 135 | 0.22
76 | 2369 | 134 | 0.22 | XP_001259479 | Ran-specific GTPase-activating protein 1, putative | Neosarto
77 | 8322 | 130 | 0.21 | EAU89607 | 60S ribosomal protein L27a | Coprinop
78 | 4043 | 130 | 0.21 | XP_752461 | Mago nashi domain protein | Aspergil
79 | 2461 | 130 | 0.21 | XP_001273801 | Extracellular conserved serine-rich protein | Aspergil
80 | 358 | 129 | 0.21 | XP_001333057 | PREDICTED: similar to Coiled-coil domain containing 12 | Danio re
81 | 2029 | 129 | 0.21 | XP_001355984 | NADH-ubiquinone oxidoreductase subunit B17.2 | Ajellomy
82 | 9300 | 127 | 0.20 | NP_033102 | Ribosomal protein L12 | Mus mus
83 | 2126 | 127 | 0.20 | EAU84422 | DNA damage checkpoint protein rad24 | Coprinop
84 | 4206 | 126 | 0.20
85 | 2051 | 125 | 0.20 | XP_569121 | 60s ribosomal protein l19, mitochondrial precursor | Cryptoco
86 | 7129 | 125 | 0.20 | XP_001260759 | Cytochrome c oxidase polypeptide vib | Neosarto
87 | 750 | 122 | 0.20 | XP_569162 | Ribosomal L10 protein | Cryptoco
88 | 2175 | 122 | 0.20 | XP_001385720 | ATP synthase d subunit | Pichia st
89 | 2064 | 121 | 0.20 | XP_647188 | Proteasome subunit beta type 6 | Dictyoste
90 | 1128 | 117 | 0.19 | XP_001275518 | ATP synthase oligomycin sensitivity conferral protein, putative | Aspergil
91 | 9564 | 116 | 0.19
92 | 6390 | 113 | 0.18 | XP_566503 | Prohibitin PHB1 | Cryptoco
93 | 624 | 112 | 0.18 | XP_567077 | Chaperone | Cryptoco
94 | 435 | 112 | 0.18 | AAZ14910 | SEC61 | Coprinell
95 | 6350 | 111 | 0.18 | XP_001271227 | 60S ribosomal protein L22, putative | Aspergil
96 | 307 | 110 | 0.18 | BAA95482 | Glia maturation factor beta | Cyprinus
97 | 1078 | 106 | 0.17 | Q4WDD7 | E3 ubiquitin-protein ligase bre1 | Aspergil
98 | 6392 | 105 | 0.17 | AAP13582 | Ras-related protein Rab7 | Lentinula
99 | 2537 | 105 | 0.17 | XP_567099 | Ribosomal protein L35 | Cryptoco
100 | 10462 | 104 | 0.17 | 1ITP_A | Chain A, Solution Structure Of Poia1 | Pleurotus
101 | 5612 | 103 | 0.17 | XP_001274195 | Vacuolar ATPase proteolipid subunit c, putative | Aspergil
102 | 7354 | 100 | 0.16 | XP_001240474 | 60S ribosomal protein L33 | Coccidio
103 | 8255 | 99 | 0.16 | XP_001267852 | Mitochondrial ATP synthase epsilon chain domain-containing protein | Aspergil
104 | 7282 | 99 | 0.16 | XP_754260 | Extracellular serine-rich protein, putative | Aspergil
105 | 6095 | [illegible] | 0.16 | XP_568690 | Vesicle-mediated transport-related protein | Cryptoco
106 | 2081 | 98 | 0.16 | CAC35202 | Endochitinase | Amanita
107 | 11010 | 97 | 0.16 | AAL05426 | Hydrophobin | Tricholo
108 | 11340 | 93 | 0.15 | XP_569415 | Ribosomal protein | Cryptoco
109 | 7519 | 93 | 0.15 | XP_572422 | DEAD box family helicase | Cryptoco
110 | 436 | 92 | 0.15 | CAL52492 | Conserved alpha-helical protein (ISS) | Ostreoco
111 | 5084 | 91 | 0.15 | P62792 | Histone H4 | Phanero
112 | 5012 | 91 | 0.15 | XP_567107 | 60s ribosomal protein l27 | Cryptoco
113 | 11656 | 91 | 0.15 | ABB96268 | Hesp-178 | Melamps
114 | 2499 | 91 | 0.15
115 | 366 | 87 | 0.14 | XP_569252 | Structural constituent of ribosome | Cryptoco
116 | 3693 | 87 | 0.14 | EAU89413 | Histone H2B.1 | Coprinop
117 | 5252 | 87 | 0.14 | XP_567555 | Transcriptional regulatory protein | Cryptoco
118 | 3451 | 87 | 0.14 | XP_572014 | Clathrin assembly protein AP47 | Cryptoco
119 | 9024 | 86 | 0.14
120 | 5430 | 86 | 0.14 | XP_001276081 | Zinc finger protein, putative | Aspergil
121 | 6014 | 86 | 0.14 | EAU80990 | Rho1 protein | Coprinop
122 | 12499 | 85 | 0.14
123 | 2327 | 85 | 0.14 | AAR91505 | Ribosomal protein L29 | Tetraodo
124 | 7510 | 85 | 0.14 | XP_568309 | Ribosomal protein L13 | Cryptoco
125 | 467 | 85 | 0.14 | XP_001267301 | 37S ribosomal protein Rsm25 | Neosarto
126 | 10445 | 84 | 0.14 | AAQ16628 | Manganese superoxide dismutase | Taiwano
127 | 7936 | 83 | 0.13 | AAH86909 | Ribosomal protein L32 | Mus mus
128 | 3725 | 82 | 0.13
129 | 2177 | 82 | 0.13 | XP_566824 | Defender against cell death 1 (dad-1) | Cryptoco
130 | 7036 | 81 | 0.13
131 | 1494 | 81 | 0.13 | XP_758500 | 40S ribosomal protein S9 (S7) | Ustilago
132 | 3292 | 81 | 0.13 | XP_569562 | Structural constituent of ribosome | Cryptoco
133 | 11081 | 81 | 0.13 | XP_001274089 | Conserved fungal protein | Aspergil
134 | 7769 | 80 | 0.13 | XP_568281 | Sterol metabolism-related protein | Cryptoco
135 | 9891 | 79 | 0.13 | XP_570752 | Mitochondrial fission-related protein | Cryptoco
136 | 8960 | 79 | 0.13 | CAD48751 | Xyn11C | Chaetom
137 | 11918 | 78 | 0.13
138 | 2646 | 78 | 0.13 | XP_570032 | Protein kinase regulator | Cryptoco
139 | 5628 | 77 | 0.12 | XP_001275305 | Actin cytoskeleton protein (VIP1), putative | Aspergil
140 | 10029 | 77 | 0.12 | XP_759980 | GTP-binding protein Ypt1 | Ustilago
141 | 8203 | 77 | 0.12 | XP_001271678 | 60S ribosomal protein L38, putative | Aspergil
142 | 223 | 76 | 0.12 | XP_571357 | 6,7-dimethyl-8-ribityllumazine synthase | Cryptoco
143 | 4879 | 75 | 0.12
144 | 3441 | 75 | 0.12 | CAA80880 | Ribosomal protein A1 | Schizosa
145 | 2737 | 75 | 0.12
146 | 4094 | 75 | 0.12
147 | 9324 | 74 | 0.12
148 | 11089 | 73 | 0.12
149 | 4847 | 73 | 0.12 | NP_193777 | snPNP-B | Arabidop
150 | 7458 | 73 | 0.12

a The GLEAN model number takes the form Gene:Jan06m300_GLEAN_XXXXX when querying the C. cinerea genome annotations.
b Occurrence of the gene model is the sum of the counts of tags found within 1 kb upstream.
c % abundance is calculated by dividing the occurrence of the gene by 61,964 (the total count of tags mapped to the putative 5'-UTR in the primordium).
d The annotation is left blank if the gene had no homologous proteins or corresponded to an unknown protein.
e An e-value of 1.00E-5 is used as confirmation of protein homology.

Differentially Expressed Genes

Using a two-tailed p-value of 0.05 in the Fisher Exact Test, a total of 1,054 differentially expressed genes were identified. Thus, of the 6,736 genes identified, 15.6% showed differential expression between the mycelial and primordial stages. Among the 1,054 genes, 358 were preferentially expressed in the mycelial stage; 207 (58%) of them were assigned a protein homolog and 139 (38.8%) were assigned GO terms (Figure 3.9). For the primordial stage, 696 genes were preferentially expressed, of which 561 (81%) were assigned a protein homolog and 484 (69.5%) were assigned GO terms (Figure 3.10). For convenience, genes with an occurrence of 0 at either stage were assigned an occurrence of 0.5 so that the fold difference between the two stages could be calculated. The differentially expressed genes in the mycelial and primordial stages for which a protein homolog was assigned are summarized in Table 3.5 and 3.6 respectively. These genes are sorted in order of decreasing significance (increasing p-value) in the Fisher Exact Test.
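The test and the fold-difference calculation can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the exact procedure of this study. The 2x2 table layout (tags of the gene versus all other mapped tags, in each library) and the two-sided rule (summing the probabilities of all tables no more likely than the observed one) are assumptions. Normalizing the fold difference by the library totals is likewise inferred, because it reproduces the tabulated values (for example 86.84 for GLEAN model 9765). The function names are hypothetical; the 0.5 substitute for zero occurrences and the two totals come from the text.

```python
from math import exp, lgamma

MYC_TOTAL = 49718   # tags mapped to putative 5'-UTRs in mycelium (from the text)
PRI_TOTAL = 61964   # tags mapped to putative 5'-UTRs in primordium (from the text)
PSEUDOCOUNT = 0.5   # replaces an occurrence of 0, as described in the text


def _log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(count_a, total_a, count_b, total_b):
    """Two-sided Fisher exact p-value for the assumed 2x2 table
    [[count_a, total_a - count_a], [count_b, total_b - count_b]]."""
    k = count_a + count_b          # all tags of this gene
    n = total_a + total_b          # all mapped tags in both libraries

    def log_p(x):
        # Hypergeometric probability of seeing x of the k tags in library A.
        return _log_comb(total_a, x) + _log_comb(total_b, k - x) - _log_comb(n, k)

    log_obs = log_p(count_a)
    lo, hi = max(0, k - total_b), min(k, total_a)
    # Sum every outcome that is no more likely than the observed one.
    p = sum(exp(log_p(x)) for x in range(lo, hi + 1) if log_p(x) <= log_obs + 1e-7)
    return min(1.0, p)


def fold_difference(occ_high, total_high, occ_low, total_low, pseudocount=PSEUDOCOUNT):
    """Ratio of relative abundances, with zero occurrences replaced by the pseudocount."""
    occ_high = pseudocount if occ_high == 0 else occ_high
    occ_low = pseudocount if occ_low == 0 else occ_low
    return (occ_high / total_high) / (occ_low / total_low)


# GLEAN model 9765: 1,951 tags in mycelium versus 28 in primordium.
p_9765 = fisher_exact_two_sided(1951, MYC_TOTAL, 28, PRI_TOTAL)
fold_9765 = fold_difference(1951, MYC_TOTAL, 28, PRI_TOTAL)   # about 86.84
# GLEAN model 7675: 55 tags in mycelium versus 0 in primordium.
fold_7675 = fold_difference(55, MYC_TOTAL, 0, PRI_TOTAL)      # about 137.09
```

Log-gamma is used instead of exact binomial coefficients so that the probabilities stay numerically manageable for library sizes in the tens of thousands.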

Investigation of the genes differentially expressed in the mycelium showed that many of them had no homologous proteins in other organisms; among the 40 most differentially expressed genes, only 10 had a protein homolog. This means that many of these genes remain understudied. Strikingly, the 3 most differentially expressed genes corresponded to a mismatched base pair and cruciform DNA recognition protein, an abundant eukaryotic nuclear protein associated with chromatin. The hydrophobin CoH1 also showed significant differential expression in the mycelium. In fact, expression levels of a total of 22 hydrophobin genes were revealed; 8 of them were differentially expressed in the mycelium compared with 4 in the primordium.

Although most of the differentially expressed genes in the primordial stage shared homology with genes in other organisms, no homologous counterparts could be identified for the top 5 genes. Investigation of the rest demonstrates an up-regulation of the whole protein synthesis machinery. Beginning with transcription, 31 genes were homologous to genes encoding transcription factors or proteins with cofactor activities, compared to only 7 in the mycelial stage. Among these 31 genes, 8 encode RNA polymerase II transcription factors. Expression of structural constituents of ribosomes is also highly up-regulated in the primordium, with 74 ribosomal protein genes being preferentially expressed compared to only 2 in the mycelium. Protein turnover may increase, as suggested by the higher expression of 12 ubiquitin-related genes, such as those encoding ubiquitin-protein ligase and ubiquitin-conjugating enzymes, and 5 proteasome subunit genes. Thirty differentially expressed genes participate in intracellular protein transport, targeting gene products to various cellular components. Notably, the primordium has a much elevated demand for proteins in order to support various physiological processes (Kües, 2000).

Moreover, 29 genes related to signal transduction, involved in either cell surface receptor-linked or intracellular signaling cascades, were preferentially expressed. This is also reflected in the 18 up-regulated genes encoding proteins with guanyl nucleotide binding activity (the G proteins), a family of proteins involved in second messenger cascades.

In addition, differential expression in response to nutrient depletion and light/dark cycles is crucial for fruiting body initiation and development. In this respect, 3 genes related to the response to nutrient levels and 2 genes associated with phototransduction were identified. Nevertheless, despite the necessity of light for the transition from hyphal knot to stage 1 primordium, no differentially expressed genes related to photoreceptors were detected; instead, the only photoreceptor identified was found to have similar expression in mycelium and primordium.

Table 3.6 Summary of the 207 (out of 358) differentially expressed genes with protein homologues in the mycelial stage.

GLEAN Occurrence'' Fold No.a Accession ID Annotation model'' Myc Pri difference''

1 9765 1951 28 86.84 XP_569409 Hmpl protein Cryptococcus neofi 2 767 1102 40 34.34 CAB85690 Mismatched base pair and cruciform DNA recognition protein Agaricus bisporus 3 779 3486 175 24.83 CAB85690 Mismatched base pair and cruciform DNA recognition protein Agaricus bisporus 5 9938 1029 139 9.23 CAG29170 Copper transporter Pleurotus sp. 'Flor 6 7677 537 1 669.27 CAA71652 CoHl Coprinopsis cinere 9 833 482 1 600.72 XP_001261781 Hemerythrin HHE cation binding domain protein Neosartorya fische 10 7676 477 2 297.24 CAA71652 CoHl Coprinopsis cinere 15 898 301 1 375.14 XP_956090 Conidiation-specific protein 6 Neurospora crassa 22 10955 160 1 199.41 CAD 10795 Putative phosphatidic acid phosphatase Pleurotus sp. Tlon 25 63 134 7 23.86 EAU83175 Serine/threonine-protein phosphatase PP2A-2 catalytic subunit Coprinopsis cinere 30 10888 88 4 27.42 EAU91281 40S ribosomal protein S18 Coprinopsis cinere 33 9958 99 10 12.34 XP_001264326 Fasciclin domain family Neosartorya fische, 34 2008 146 35 5.20 XP_751209 RNA binding protein, putative Aspergillus fumiga. 40 8431 69 3 28.67 XP_755660 Alcohol dehydrogenase, putative Aspergillus fumigai 41 1507 119 28 5.30 XP_001261730 Thioredoxin, putative Neosartorya fischei 42 7675 55 0 137.09 CAA74987 Hydrophobin Pleurotus ostreatus 43 3932 57 1 71.04 ZP_01352093 Glycosyl hydrolase, family 88 Clostridium phytofi 44 7750 60 2 37.39 Q9UVF8 Thiazole biosynthetic enzyme, mitochondrial precursor Sclerotinia sclerotu 47 416 45 0 112.17 XP—570275 Oxidoreductase Cryptococcus neofc 50 12063 42 1 52.34 XP—568483 (R,R)-butanediol dehydrogenase Cryptococcus neofc 52 9795 40 1 49.85 XP—001275718 Hemerythrin HHE cation binding domain protein Neosartorya fische} 54 9976 50 5 12.46 EDN63827 Conserved protein Saccharomyces cer 58 9766 44 4 13.71 ABE89643 Aldehyde dehydrogenase Medicago tmncatui

94 Table 3.4 (continued) 60 5205 99 38 3.25 XP_567104 88 kDa immunoreactive mannoprotein MP88 Cryptococcus neof 62 5645 31 0 77.27 AAL84791 HSP30 Exophiala dermati 61 1623 31 0 77.27 NP_822258 Secreted protein Streptomyces aven 65 226 47 7 8.37 XP_572088 12 kda heat shock protein (glucose and lipid-regulated protein) Cryptococcus neof 66 1670 27 0 67.30 YP_001105365 Secreted protein Saccharopolyspore 67 341 40 5 9.97 XP_001269398 Ketoreductase, putative Aspergillus clavati 68 899 26 0 64.81 XP_568304 DNA unwinding-related protein Cryptococcus neof 69 3814 49 10 6.11 YP—412388 Phosphoesterase, PA-phosphatase related Nitrosospira multij 71 874 25 0 62.32 NP_ 193091 Triacylglycerol lipase Arabidopsis thaliai 73 7751 43 8 6.70 NP_080276 Proteasome 26S non-ATPase subunit 9 Mus muscuius 74 6785 35 4 10.91 XP_755596 Integral membrane protein Aspergillus fumiga 75 5285 46 11 5.21 XP_001604847 PREDICTED: similar to PASG Nasonia vitripennu 76 8558 33 4 10.28 XP_746759 Phosphatidylserine decarboxylase family protein Aspergillus fumiga 77 2968 25 1 31.16 AAM78595 Small heat shock protein Laccaria bicolor 78 2513 43 10 5.36 ZP_01521948 ThiJ/PfpI Comamonas esto 79 6750 21 0 52.34 CAL46260 Putative mitochondrial inner membrane protein 1 Botryotinia fuckelii 80 3858 49 15 4.07 XP_001270808 Glutathione S-transferase, putative Aspergillus clavatu 82 7562 19 0 47.36 XP_001384300 NADPH-dependent alcohol dehydrogenase Pichia stipitis CBS 84 9570 51 18 3.53 AAX51843 Glutamate decarboxylase Paxillus involutus 85 10462 149 104 1.79 JC7625 Proteinase A inhibitor 1 Pleurotus ostreatus 86 1209 37 9 5.12 BAD14303 Flap endonuclease-1 Coprinopsis cinere� 88 6355 30 5 7.48 XP—001272574 DnaJ chaperone (Cajl), putative Aspergillus clavatu 90 10887 17 0 42.37 EAU91281 40S ribosomal protein S18 Coprinopsis cinere, 94 972 19 1 23.68 XP_001273228 Sodium/calcium exchanger protein Aspergillus clavatu 96 1021 16 0 39.88 XP_572595 Thiamine biosynthetic bifunctional enzyme Cryptococcus neoft 97 9606 27 5 
6.73 XP_570444 Dihydrolipoyllysine-residue acetyltransferase Cryptococcus neoft 100 7678 18 1 22.43 CAA71653 CoH2 Coprinopsis cinere,

95 Table 3.4 (continued) 103 9796 15 0 37.39 XP—001267572 HHE domain protein Neosartorya fische 105 5720 24 4 7.48 BAB84545 Hydrophobin-251 Pholiota nameko 109 8279 19 2 11.84 XP_001246353 Glutathione S-transferase Coccidioides immii 110 2093 21 3 8.72 NP—001056513 0s05g0595200 Oryza sativa (japoi 113 2521 29 8 4.52 XP_001218186 DNA damage checkpoint protein rad24 Aspergillus terreus 114 3578 13 0 32.40 XP_001265527 FAD binding domain protein Neosartorya fische 117 10069 13 0 32.40 XP_571893 Nucleus protein Cryptococcus neofi 115 3844 13 0 32.40 AAU09714 YEL023C Saccharomyces cer 118 7055 20 3 8.31 XP_001257472 GPI anchored protein, putative Neosartorya fische. 119 1528 48 23 2.60 AAD43253 Peptide methionine sulfoxide reductase Gracilaria gracilis 120 6948 23 5 5.73 ABM91452 Conserved protein EasG Neotyphodium lolii 125 11733 15 1 18.69 AAA21525 Meiotin-1 Lilium longiflorum 124 1466 15 1 18.69 EAU92358 Mitogen-activated protein kinase styl Coprinopsis cinere� 127 11907 17 2 10.59 NP_983585 ACR183Cp Ashbya gossypii Al 126 3124 17 2 10.59 XP_001266201 HypA-like protein,putative Neosartorya fischei 128 7196 12 0 29.91 YP_110854 Oxidoreductase, short chain dehydrogenase/reductase family Aspergillus clavatu 130 932 22 5 5.48 NP—984760 AELlOlCp Ashbya gossypii Al 132 1124 39 17 2.86 XP_717494 aminopeptidase Candida albicans S 135 6634 11 0 27.42 XP_001267951 Zinc-binding oxidoreductase ToxD, putative Aspergillus clavatu 137 10455 23 6 4.78 XP_001269875 Aspartyl-tRNA synthetase, cytoplasmic Aspergillus clavatu 138 7766 13 1 16.20 XP_001270808 Glutathione S-transferase, putative Aspergillus clavatu. 
139 4142 23 7 4.10 AAH16739 TERF2IP protein Homo sapiens 145 3988 10 0 24.93 XP—567886 Aryl-alcohol dehydrogenase Cryptococcus neofc 140 1213 10 0 24.93 ABE01888 Family 614/534 cytochrome P450 Phanerochaete chr) 144 3856 10 0 24.93 XP—001258070 Fungal cellulose binding domain protein Neosartorya fischer 143 3258 10 0 24.93 AATl 1911 Immunomodulatory protein Antrodia camphora 142 2809 10 0 24.93 CAJ88058 Putative large multifunctional secreted protein Streptomyces ambo,

96 Table 3.6 (Continued) 147 10723 28 11 3.17 CAD 10797 Putative cyclophilin Pleurotus sp. 'Flor 149 10623 12 1 14.96 XP_747075 Metalloreductase, putative Aspergillus fumiga 151 6912 39 20 2.43 XP_571709 Rho small monomeric GTPase Cryptococcus neofi 158 7888 9 0 22.43 XP_001273120 GNAT family acetyltransferase, putative Aspergillus clavatu 154 3236 9 0 22.43 BAB84545 Hydrophobin-251 Pholiota nameko 161 11118 9 0 22.43 XP_423398 PREDICTED: similar to Cingulin Gallus gallus 156 4368 9 0 22.43 XP_571334 Protein farnesyltransferase Cryptococcus neofi 162 9977 11 1 13.71 XP_571027 Protein-nucleus import-related protein Cryptococcus neofi 164 9292 44 26 2.11 ZP_00345738 COG2814: Arabinose efflux permease Nostoc punctiforme 167 5347 14 3 5.82 EAL85378 Serine-threonine rich protein, putative Aspergillus fumiga. 168 864 8 0 19.94 XP_956090 Conidiation-specific protein 6 Neurospora crassa 172 10778 8 0 19.94 NP_694818 Eukaryotic translation initiation factor 2C, 2 Mus musculus 173 11092 8 0 19.94 NP—015171 Pep4p Saccharomyces cer 174 3633 21 8 3.27 XP—566856 U6 snRNA-associated sm-like protein lsm8 Cryptococcus neofi 175 6733 10 1 12.46 ABM21576 CrpH Nostoc sp. ATCC 5 176 8506 10 1 12.46 YP—661053 Protein of unknown function DUF1486 Pseudoalteromonai 177 5628 93 77 1.51 XP—001275305 Actin cytoskeleton protein (VIP 1),putative Aspergillus clavatu 179 5055 12 2 7.48 XP—566548 C-14 sterol reductase Cryptococcus neofc 180 5235 12 2 7.48 XP—001268328 Dienelactone hydrolase family protein Aspergillus clavatu. 178 3034 12 2 7.48 XP—572884 Lactoylglutathione lyase Cryptococcus neofc 181 6302 28 14 2.49 XP_570172 6-phosphogluconolactonase Cryptococcus neofc 183 5955 13 3 5.40 XP_001264238 Hsp70 family protein Neosartorya fische? 
184 | 10364 | 13 | 3 | 5.40 | XP_750123 | MFS transporter, putative | Aspergillus fumigat
191 | 8253 | 7 | 0 | 17.45 | XP_568025 | Exo-beta-1,3-glucanase | Cryptococcus neofc
186 | 1067 | 7 | 0 | 17.45 | XP_571419 | Formate dehydrogenase | Cryptococcus neofo
192 | 11351 | 7 | 0 | 17.45 | YP_563077 | Glyoxalase/bleomycin resistance protein/dioxygenase | Shewanella denitrif
 | 4500 | 7 | 0 | 17.45 | ABG79371 | Man5C | Phanerochaete chn

193 | 11520 | 7 | 0 | 17.45 | XP_692678 | PREDICTED: similar to LOC553316 protein, partial | Danio rerio
197 | 4733 | 9 | 1 | 11.22 | AAL05426 | Hydrophobin | Tricholoma terreu
198 | 7167 | 9 | 1 | 11.22 | ZP_01422708 | Twin-arginine translocation pathway signal | Caulobacter sp. K3
200 | 2321 | 14 | 4 | 4.36 | XP_568318 | Formaldehyde dehydrogenase (glutathione) | Cryptococcus neofi
202 | 4473 | 12 | 3 | 4.99 | XP_570607 | Co-chaperone | Cryptococcus neofi
203 | 7474 | 27 | 15 | 2.24 | ABD14580 | Zonadhesin variant 6 | Pongo pygmaeus
204 | 4175 | 23 | 11 | 2.61 | XP_310406 | ENSANGP00000018863 | Anopheles gambiae
205 | 6128 | 30 | 18 | 2.08 | XP_001257472 | GPI anchored protein, putative | Neosartorya fische
206 | 4993 | 19 | 8 | 2.96 | CAC03461 | Putative chloroperoxidase | Agaricus bisporus
207 | 885 | 13 | 4 | 4.05 | NP_196963 | Isocitrate dehydrogenase (NADP+)/ oxidoreductase | Arabidopsis thaliar
210 | 1098 | 6 | 0 | 14.96 | XP_962121 | BLI-3 PROTEIN | Neurospora crassa
211 | 2020 | 6 | 0 | 14.96 | XP_566450 | Calcium ion transporter | Cryptococcus neofi
208 | 816 | 6 | 0 | 14.96 | XP_753962 | DUF89 domain protein | Aspergillus fumigai
213 | 3974 | 6 | 0 | 14.96 | ZP_00106689 | Probable biotin carboxylase | Hahella chejuensis
217 | 8171 | 6 | 0 | 14.96 | AAL06079 | QDE2 protein | Blumeria graminis
223 | 6788 | 8 | 1 | 9.97 | XP_761532 | GBA4_USTMA Guanine nucleotide-binding protein alpha-4 subunit | Ustilago maydis 52
221 | 4282 | 8 | 1 | 9.97 | XP_572087 | Protoporphyrinogen oxidase | Cryptococcus neoft
220 | 1776 | 8 | 1 | 9.97 | CAH19236 | Putative SAM-dependent methyltransferase | Aspergillus niger
219 | 32 | 8 | 1 | 9.97 | XP_723189 | Putative zinc-binding dehydrogenase | Candida albicans S
225 | 3930 | 23 | 12 | 2.39 | CAA05528 | PMP20 | Schizosaccharomyc
227 | 3704 | 10 | 2 | 6.23 | XP_001268476 | Haloalkanoic acid dehalogenase, putative | Aspergillus clavatu
226 | 376 | 10 | 2 | 6.23 | XP_001275688 | Rna-dependent rna polymerase | Aspergillus clavatu
229 | 5870 | 19 | 9 | 2.63 | ZP_00754960 | COG1335: Amidases related to nicotinamidase | Vibrio cholerae 03
230 | 11389 | 27 | 16 | 2.10 | XP_568281 | Sterol metabolism-related protein | Cryptococcus neofc
231 | 10510 | 14 | 5 | 3.49 | XP_001260430 | Cellulase, putative | Neosartorya fischer
233 | 1131 | 20 | 10 | 2.49 | ABG37636 | Aspartyl protease | Filobasidiella neofc
234 | 5812 | 23 | 13 | 2.21 | CAF05868 | Probable brtl protein | Neurospora crassa

237 | 2378 | 9 | 2 | 5.61 | XP_571517 | 5'-3' exoribonuclease | Cryptococcus neof
236 | 678 | 9 | 2 | 5.61 | ZP_00857706 | Oxidoreductase | Bradyrhizobium sp
240 | 5646 | 11 | 3 | 4.57 | XP_572073 | DNA polymerase epsilon p12 subunit | Cryptococcus neoj
242 | 11635 | 11 | 3 | 4.57 | AAH44109 | Pdcd6-prov protein | Xenopus laevis
246 | 10570 | 7 | 1 | 8.72 | XP_001248318 | Protein-L-isoaspartate O-methyltransferase | Coccidioides immi
243 | 3941 | 7 | 1 | 8.72 | AAW26483 | SJCHGC01421 protein | Schistosoma japon
244 | 7226 | 7 | 1 | 8.72 | XP_001261298 | Sugar transporter | Neosartorya fische
249 | 1159 | 5 | 0 | 12.46 | XP_749038 | Alpha-1,2-mannosidase, putative | Aspergillus fumiga
251 | 2863 | 5 | 0 | 12.46 | XP_571668 | Anthranilate synthase | Cryptococcus neofi
262 | 7106 | 5 | 0 | 12.46 | AAR17472 |  | Cochliobolus heter
258 | 4516 | 5 | 0 | 12.46 | NP_611703 | CG4250-PA | Drosophila melanc
247 | 900 | 5 | 0 | 12.46 | CAD32177 | D-lactate dehydrogenase | Neisseria meningit
253 | 3322 | 5 | 0 | 12.46 | XP_567033 | Exocyst protein | Cryptococcus neofi
268 | 9758 | 5 | 0 | 12.46 | XP_754309 | HHE domain protein | Aspergillus fumiga
272 | 11735 | 5 | 0 | 12.46 | CAA74987 | Hydrophobin | Pleurotus ostreatus
248 | 986 | 5 | 0 | 12.46 | NP_001024044 | K08H10.2b | Caenorhabditis ele
270 | 10828 | 5 | 0 | 12.46 | AAH83282 | LOC553260 protein | Danio rerio
261 | 6996 | 5 | 0 | 12.46 | XP_001275918 | NAD-binding Rossmann fold oxidoreductase family protein | Aspergillus clavatu
252 | 2966 | 5 | 0 | 12.46 | YP_873147 | Peptidase S8 and S53, subtilisin, kexin, sedolisin | Acidothermus cellu
259 | 5571 | 5 | 0 | 12.46 | XP_970632 | PREDICTED: similar to CG3884-PB, isoform B | Tribolium castaneu
264 | 7878 | 5 | 0 | 12.46 | XP_748571 | PRO1 protein | Cryptococcus neofc
254 | 3956 | 5 | 0 | 12.46 | CAD12881 | Putative C2H2 zinc finger protein | Podospora anserim
255 | 4009 | 5 | 0 | 12.46 | BAC67687 | Putative laminarinase | Phanerochaete chr
263 | 7503 | 5 | 0 | 12.46 | XP_001386926 | Related to membrane protein [MIPS] | Neurospora crassa
250 | 2292 | 5 | 0 | 12.46 | XP_001266947 | t-SNARE VTI1 | Lodderomyces elon
275 | 9561 | 28 | 18 | 1.94 | XP_567927 | Nucleus protein | Cryptococcus neofc
276 | 9956 | 15 | 7 | 2.67 | XP_572689 | Protein kinase | Cryptococcus neofc


278 | 2041 | 38 | 29 | 1.63 | XP_569269 | Iron-sulfur cluster assembly-related protein | Cryptococcus neoj
279 | 1911 | 10 | 3 | 4.15 | XP_001268131 | Sugar transporter | Aspergillus clavati
281 | 10013 | 14 | 6 | 2.91 | XP_001263811 | dsDNA-binding protein PDCD5, putative | Neosartorya fische
284 | 12026 | 8 | 2 | 4.99 | ZP_01461750 | Pentachlorophenol 4-monooxygenase | Stigmatella aurant
283 | 5821 | 8 | 2 | 4.99 | XP_001275870 | TAM domain methyltransferase, putative | Aspergillus clavati
286 | 5178 | 12 | 5 | 2.99 | NP_567342 | Phosphatase activator | Arabidopsis thalia
289 | 9293 | 24 | 16 | 1.87 | BAC76768 | DNA primase catalytic subunit | Coprinopsis cinere
288 | 2109 | 24 | 16 | 1.87 | XP_570307 | Protein-binding protein | Cryptococcus neoj
292 | 5629 | 10 | 4 | 3.12 | YP_321342 | Cyclopropane-fatty-acyl-phospholipid synthase | Anabaena variabil
312 | 6067 | 4 | 0 | 9.97 | XP_569225 | AEO16780 membrane protein | Cryptococcus neoj
299 | 1840 | 4 | 0 | 9.97 | XP_001259221 | Ankyrin repeat protein | Neosartorya fische
294 | 539 | 4 | 0 | 9.97 | XP_001272956 | Cell polarity protein (Tea1), putative | Aspergillus clavati
295 | 817 | 4 | 0 | 9.97 | 1H5Q_A | Chain A, Mannitol Dehydrogenase From Agaricus Bisporus | 
297 | 905 | 4 | 0 | 9.97 | NP_744319 | Chorismate mutase family protein | Pseudomonas putic
326 | 11242 | 4 | 0 | 9.97 | XP_001264742 | Chromodomain helicase (Chd1), putative | Neosartorya fische
298 | 1577 | 4 | 0 | 9.97 | ABM91452 | Conserved protein EasG | Neotyphodium lolii
300 | 2088 | 4 | 0 | 9.97 | XP_001264831 | Eukaryotic translation initiation factor 3 subunit EifCf, putative | Neosartorya fische
316 | 7215 | 4 | 0 | 9.97 | XP_001355128 | GA10875-PA | Drosophila pseudo
317 | 7549 | 4 | 0 | 9.97 | AAC48526 | Gastric mucin | Sus scrofa
310 | 5355 | 4 | 0 | 9.97 | XP_001267343 | G-patch DNA repair protein (Drt111), putative | Neosartorya fische
323 | 9968 | 4 | 0 | 9.97 | AAX51848 | Hydrophobin | Paxillus involutus
301 | 2179 | 4 | 0 | 9.97 | XP_850761 | Similar to isochorismatase domain containing 2 isoform 1 | Canis familiaris
320 | 8503 | 4 | 0 | 9.97 | XP_714273 | Putative 26S proteasome regulatory particle subunit Rpn10p | Candida albicans
319 | 8415 | 4 | 0 | 9.97 | CAJ00405 | Putative CyP450 monooxygenase | Pleurotus sapidus
329 | 11794 | 4 | 0 | 9.97 | CAJ00405 | Putative CyP450 monooxygenase | Pleurotus sapidus
313 | 6273 | 4 | 0 | 9.97 | ZP_01136019 | Putative two-component member protein | Pseudoalteromona
322 | 8697 | 4 | 0 | 9.97 | NP_001012798 | RNA binding motif protein 5 | Gallus gallus

325 | 11204 | 4 | 0 | 9.97 | ABL98208 | Serine protease | Hypsizygus marmc
327 | 11387 | 4 | 0 | 9.97 | XP_001273832 | ThiJ/PfpI family protein | Aspergillus clavati
328 | 11626 | 4 | 0 | 9.97 | XP_001273482 | Tyrosinase central domain protein | Aspergillus clavati
293 | 527 | 4 | 0 | 9.97 | XP_001271289 | Ubiquitin-conjugating enzyme, putative | Aspergillus clavati
306 | 4656 | 4 | 0 | 9.97 | XP_001265970 | UPF0187 domain membrane protein | Neosartorya fische
330 | 12461 | 4 | 0 | 9.97 | YP_801693 | Zinc-binding alcohol dehydrogenase | Leptospira borgpe
331 | 4070 | 17 | 10 | 2.12 | XP_001268322 | HIT domain protein | Aspergillus clavati
334 | 4144 | 9 | 3 | 3.74 | CAH17527 | Carotenoid ester lipase precursor | Pleurotus sapidus
332 | 3014 | 9 | 3 | 3.74 | Q6URB0 | Cytochrome c peroxidase, mitochondrial precursor (CCP) | Cryptococcus neof
333 | 3878 | 9 | 3 | 3.74 | XP_749852 | NACHT domain protein, putative | Aspergillus fumiga
335 | 9029 | 18 | 11 | 2.04 | EAT38763 | RAS protein, putative | Aedes aegypti
336 | 12162 | 11 | 5 | 2.74 | XP_001247186 | Cystathionine beta-synthase | Coccidioides immh
338 | 7877 | 7 | 2 | 4.36 | XP_754095 | C6 transcription factor, putative | Aspergillus fumiga
340 | 11313 | 7 | 2 | 4.36 | XP_311747 | ENSANGP00000014213 | Anopheles gambiat
337 | 2538 | 7 | 2 | 4.36 | AAM73879 | Ste20-like kinase Don3 | Ustilago maydis
353 | 9683 | 6 | 1 | 7.48 | XP_001265139 | Endo-1,3(4)-beta-glucanase, putative | Neosartorya fische
358 | 12192 | 6 | 1 | 7.48 | NP_009839 | F-Box protein | Saccharomyces cer
346 | 3567 | 6 | 1 | 7.48 | AAQ24589 | NADH-quinone oxidoreductase | Gloeophyllum trab
357 | 10639 | 6 | 1 | 7.48 | XP_748278 | Oxidase, putative | Aspergillus fumiga
348 | 4367 | 6 | 1 | 7.48 | ABF51354 | phosphatidylethanolamine binding protein isoform 2 | Bombyx mori
345 | 2620 | 6 | 1 | 7.48 | XP_001121015 | PREDICTED: similar to futsch CG3064-PB | Apis mellifera
354 | 9826 | 6 | 1 | 7.48 | XP_566592 | Vesicle-associated membrane protein 712 | Cryptococcus neofc

a The numbering is not continuous because genes without protein homologues are not included.

b GLEAN model numbers take the form Gene:Jan06m300_GLEAN_XXXXX when queried in the C. cinerea genome annotation.

c The occurrence of a gene model is the sum of the counts of tags found 1 kb upstream.

d Fold difference is calculated by dividing the % abundance in mycelium by that in primordium.

e An e-value of 1.00E-5 is used as the cutoff for protein homology.

f A p-value of 0.05 in the Fisher exact test is used as the cutoff for differential expression.
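The calculations described in footnotes d and f can be sketched as follows. This is a minimal illustration, not the pipeline used in the thesis: the function names, the library totals passed as `myc_total` and `pri_total`, and the pseudo-count used when one library has no tags (`zero_substitute=0.5`) are all assumptions, since the footnotes state neither the library sizes nor how zero counts are handled. The Fisher exact test is written out with log-gamma terms so that it stays numerically stable for library sizes in the hundreds of thousands.

```python
import math


def fold_difference(myc_count, pri_count, myc_total, pri_total, zero_substitute=0.5):
    """Ratio of % abundance in mycelium to % abundance in primordium (footnote d).

    zero_substitute is a hypothetical pseudo-count that replaces a count of 0
    so the ratio stays finite; the source does not state how zeros are treated.
    """
    myc = myc_count if myc_count > 0 else zero_substitute
    pri = pri_count if pri_count > 0 else zero_substitute
    return (myc / myc_total) / (pri / pri_total)


def _log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def fisher_exact_two_sided(myc_count, pri_count, myc_total, pri_total):
    """Two-sided Fisher exact test on the 2x2 table
    [[myc_count, myc_total - myc_count], [pri_count, pri_total - pri_count]].
    """
    k = myc_count + pri_count  # total tags assigned to this gene
    log_denom = _log_comb(myc_total + pri_total, k)

    def log_prob(x):
        # Hypergeometric probability that x of the k tags come from mycelium.
        return _log_comb(myc_total, x) + _log_comb(pri_total, k - x) - log_denom

    lowest = max(0, k - pri_total)
    highest = min(k, myc_total)
    log_observed = log_prob(myc_count)
    # Sum every table that is no more probable than the observed one.
    p_value = sum(
        math.exp(lp)
        for lp in (log_prob(x) for x in range(lowest, highest + 1))
        if lp <= log_observed + 1e-9
    )
    return min(1.0, p_value)


def is_differentially_expressed(myc_count, pri_count, myc_total, pri_total, alpha=0.05):
    """Apply the p-value cutoff of footnote f."""
    return fisher_exact_two_sided(myc_count, pri_count, myc_total, pri_total) < alpha
```

Called with a gene's two tag counts and the two library totals, `fold_difference` returns the ratio reported in the fold-difference column and `is_differentially_expressed` applies the 0.05 cutoff. The ratio is written mycelium over primordium, as in footnote d; swapping the arguments gives the reverse comparison.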

Table 3.7 Summary of the 561 (out of 696) differentially expressed genes with protein homologues

No. (a) | GLEAN model (b) | Occurrence (c), Myc | Occurrence (c), Pri | Fold difference (d) | Accession ID | Annotation | Organism

6 | 11011 | 0 | 223 | 357.86 | BAB84546 | Hydrophobin-263 | Pholiota nameko
7 | 1327 | 55 | 374 | 5.46 | ABD64675 | CGL3 lectin | Coprinopsis cinen
8 | 733 | 237 | 805 | 2.73 | XP_571980 | 60S ribosomal protein L24 (L30) | Cryptococcus neoj
9 | 5528 | 20 | 238 | 9.55 | AAW48295 | Pore-forming toxin-like protein Hfr-2 | Triticum aestivum
10 | 8731 | 0 | 138 | 221.45 | AAL05426 | Hydrophobin | Tricholoma terreu
11 | 7469 | 1 | 144 | 115.54 | XP_220903 | PREDICTED: similar to calcium binding and coiled-coil domain 2 | Rattus norvegicus
13 | 8487 | 587 | 1381 | 1.89 | XP_362846 | Ribitol kinase | Magnaporthe grist
14 | 2775 | 129 | 471 | 2.93 | XP_572112 | 40s ribosomal protein s23 | Cryptococcus neoj
15 | 11010 | 0 | 97 | 155.66 | AAL05426 | Hydrophobin | Tricholoma terreu
16 | 3305 | 66 | 302 | 3.67 | ZP_01112629 | Protein containing QXW lectin repeats | Reinekea sp. MED
17 | 2081 | 2 | 98 | 39.32 | CAC35202 | Endochitinase | Amanita muscaria
18 | 8960 | 0 | 79 | 126.77 | CAD48751 | Xyn11C | Chaetomium them
19 | 7754 | 312 | 764 | 1.96 | CAD10794 | Putative ribosomal protein S19 | Pleurotus sp. 'Flor
20 | 2646 | 0 | 78 | 125.17 | XP_570032 | Protein kinase regulator | Cryptococcus neof
21 | 4957 | 35 | 196 | 4.49 | XP_570222 | 40s ribosomal protein s6-b | Cryptococcus neoj
22 | 2461 | 15 | 130 | 6.95 | XP_001273801 | Extracellular conserved serine-rich protein | Aspergillus clavati

23 | 6814 | 100 | 336 | 2.70 | Q92196 | ATP synthase delta chain, mitochondrial precursor | Agaricus bisporus
24 | 7518 | 57 | 237 | 3.34 | ABB72849 | Eukaryotic ADP/ATP carrier | Cryptococcus neoj
25 | 7350 | 103 | 336 | 2.62 | AAC69196 | 40S ribosomal protein S8 | Schizophyllum con
26 | 11998 | 1 | 67 | 53.76 | CAJ44440 | Tropomyosin | Dermanyssus galli
27 | 7309 | 28 | 154 | 4.41 | NP_956002 | Basic leucine zipper and W2 domains 1 | Danio rerio
28 | 12034 | 30 | 156 | 4.17 | XP_568023 | Methylenetetrahydrofolate reductase (NADPH) | Cryptococcus neoj
29 | 183 | 0 | 57 | 91.47 | XP_567607 | Membrane transporter | Cryptococcus neoj
30 | 7282 | 11 | 99 | 7.22 | XP_754260 | Extracellular serine-rich protein, putative | Aspergillus fumiga
31 | 5612 | 13 | 103 | 6.36 | XP_001274195 | Vacuolar ATPase proteolipid subunit c, putative | Aspergillus clavati
32 | 397 | 0 | 53 | 85.05 | CAD12833 | Hydrophobin 2 | Pleurotus sp. 'Flor
34 | 7923 | 39 | 167 | 3.44 | AAQ73138 | Acyl carrier protein 1 | Chlamydomonas n
35 | 2064 | 21 | 121 | 4.62 | NP_001029541 | Proteasome (prosome, macropain) subunit, beta type, 6 | Bos taurus
36 | 2029 | 24 | 129 | 4.31 | XP_001536006 | NADH-ubiquinone oxidoreductase subunit B17.2 | Ajellomyces capsm
37 | 8541 | 0 | 48 | 77.03 | XP_001262810 | Phosphatidylserine decarboxylase family protein | Neosartorya fische
38 | 838 | 79 | 249 | 2.53 | XP_566913 | PRCDNA87 | Cryptococcus neofi
39 | 5935 | 1 | 52 | 41.72 | XP_001276358 | Histidine acid phosphatase, putative | Aspergillus clavatu
40 | 6344 | 137 | 360 | 2.11 | NP_986927 | AGR261Wp | Ashbya gossypii Al
41 | 6412 | 29 | 136 | 3.76 | XP_001259287 | Hsp90 binding co-chaperone (Sba1), putative | Neosartorya fische
43 | 2669 | 2 | 55 | 22.07 | YP_322046 | Aldo/keto reductase | Anabaena variabili
45 | 10389 | 0 | 44 | 70.61 | XP_001267590 | Allantoate permease | Neosartorya fische
47 | 2051 | 26 | 125 | 3.86 | XP_569121 | 60s ribosomal protein l19, mitochondrial precursor | Cryptococcus neofi
48 | 7129 | 26 | 125 | 3.86 | XP_001260759 | Cytochrome c oxidase polypeptide vib | Neosartorya fische
49 | 1366 | 254 | 561 | 1.77 | XP_570887 | 60s ribosomal protein l21-a | Cryptococcus neofi
51 | 11858 | 112 | 303 | 2.17 | EAU93093 | 40S ribosomal protein S11 | Coprinopsis cinere
52 | 10855 | 3 | 57 | 15.25 | NP_567769 | Zinc finger (C2H2 type) family protein | Arabidopsis thaliar
54 | 9822 | 102 | 282 | 2.22 | EAU86619 | 40S ribosomal protein S14 | Coprinopsis cinerei
55 | 8557 | 1 | 47 | 37.71 | XP_567191 | Phosphatidylserine decarboxylase | Cryptococcus neofc

56 | 358 | 29 | 129 | 3.57 | XP_001333057 | PREDICTED: similar to Coiled-coil domain containing 12 | Danio rerio
57 | 7166 | 0 | 39 | 62.58 | XP_755771 | Prenyl cysteine carboxyl methyltransferase, putative | Aspergillus fumigc
58 | 2183 | 3 | 52 | 13.91 | XP_001259421 | CCCH zinc finger DNA binding protein | Neosartorya fische
60 | 4843 | 0 | 37 | 59.38 | XP_717928 | Putative ubiquitin-protein ligase | Candida albicans
62 | 1375 | 8 | 67 | 6.72 | XP_569379 | Phosphopyruvate hydratase | Cryptococcus neoj
63 | 223 | 11 | 76 | 5.54 | XP_571357 | 6,7-dimethyl-8-ribityllumazine synthase | Cryptococcus neoj
64 | 181 | 71 | 211 | 2.38 | BAC78820 | Ubiquitin-conjugating enzyme9 | Coprinopsis cinen
67 | 6640 | 2 | 45 | 18.05 | XP_001258880 | 3-beta hydroxysteroid dehydrogenase/isomerase family protein | Neosartorya fische
70 | 2330 | 8 | 64 | 6.42 | XP_001064982 | Similar to small nuclear ribonucleoprotein polypeptide G | Rattus norvegicus
71 | 3562 | 96 | 253 | 2.11 | XP_001212923 | 60S ribosomal protein L20 | Aspergillus terreui
72 | 7354 | 22 | 100 | 3.65 | XP_001240474 | 60S ribosomal protein L33 | Coccidioides immi
74 | 4043 | 35 | 130 | 2.98 | XP_752461 | Mago nashi domain protein | Aspergillus fumiga
75 | 10819 | 9 | 65 | 5.79 | NP_595234 | Glucoamylase | Schizosaccharomy
76 | 1078 | 25 | 106 | 3.40 | Q4WDD7 | E3 ubiquitin-protein ligase bre1 | Aspergillus fumiga
78 | 4053 | 1 | 37 | 29.69 | EAW96409 | hCG23738, isoform CRA_b | Homo sapiens
79 | 11856 | 5 | 52 | 8.34 | BAA33979 | IPP isomerase | Xanthophyllomyce
80 | 11852 | 40 | 136 | 2.73 | XP_001266325 | NADH-ubiquinone oxidoreductase | Neosartorya fische
81 | 7246 | 131 | 310 | 1.90 | XP_001248452 | 40S ribosomal protein S15 | Coccidioides immi
82 | 11081 | 16 | 81 | 4.06 | XP_001274089 | Conserved fungal protein | Aspergillus clavatu
83 | 9297 | 96 | 245 | 2.05 | AAP13580 | Guanine nucleotide binding protein beta subunit | Lentinula edodes
84 | 839 | 8 | 60 | 6.02 | AAH97886 | MGC115667 protein | Xenopus laevis
85 | 5853 | 13 | 72 | 4.44 | XP_569876 | Riboflavin synthase | Cryptococcus neofi
86 | 5389 | 1 | 35 | 28.08 | YP_434834 | Endoglucanase C-terminal domain/subunit and related protein | Hahella chejuensis
88 | 7519 | 22 | 93 | 3.39 | XP_572422 | DEAD box family helicase | Cryptococcus neofc
89 | 7947 | 210 | 440 | 1.68 | XP_569827 | Structural constituent of ribosome | Cryptococcus neofi
91 | 2467 | 137 | 313 | 1.83 | XP_001607900 | PREDICTED: similar to phosphatidylethanolamine-binding protein | Nasonia vitripennu
92 | 2060 | 5 | 47 | 7.54 | AAS38620 | Similar to F55B11.3.p | Caenorhabditis ele

93 | 10853 | 9 | 59 | 5.26 | XP_001384727 | Eukaryotic translation initiation factor 5A (eIF-5A) (eIF-4D) | Pichia stipitis CBS
95 | 1983 | 5 | 46 | 7.38 | NP_010995 | Vacuolar transporter chaperon (VTC) | Saccharomyces cen
96 | 2581 | 7 | 51 | 5.85 | XP_001257674 | MIPC synthase subunit (SurA), putative | Neosartorya fischer
98 | 2126 | 40 | 127 | 2.55 | EAU84422 | DNA damage checkpoint protein rad24 | Coprinopsis cinerec
99 | 275 | 9 | 57 | 5.08 | XP_752201 | General amino acid permease (Agp2), putative | Aspergillus fumigat
100 | 6827 | 71 | 188 | 2.12 | AAP20199 | 40S ribosomal protein S5 | Pagrus major
101 | 1684 | 6 | 48 | 6.42 | XP_747457 | DUF895 domain membrane protein | Aspergillus fumigat
102 | 2002 | 70 | 185 | 2.12 | NP_001049239 | Os03g0192400 | Oryza sativa (japon
103 | 2175 | 38 | 122 | 2.58 | XP_001385720 | ATP synthase d subunit | Pichia stipitis CBS
104 | 7785 | 0 | 25 | 40.12 | Q06100 | Galectin-1 (Galectin I) (Cgl-I) | Coprinopsis cinerec
106 | 3451 | 22 | 87 | 3.17 | XP_572014 | Clathrin assembly protein AP47 | Cryptococcus neofo
107 | 88 | 4 | 40 | 8.02 | XP_001385169 | Guanylate kinase (GUK1) | Pichia stipitis CBS
108 | 6780 | 7 | 49 | 5.62 | XP_001260566 | Enoyl-CoA hydratase/isomerase family protein | Neosartorya fischer
110 | 4856 | 3 | 36 | 9.63 | XP_566608 | 60s ribosomal protein l30-1 (l32) | Cryptococcus neofo
111 | 6802 | 6 | 45 | 6.02 | CAE85525 | Probable electron transfer flavoprotein alpha chain precursor | Neurospora crassa
112 | 5468 | 424 | 766 | 1.45 | XP_568367 | Rab GTPase activator | Cryptococcus neofo
113 | 6368 | 7 | 48 | 5.50 | XP_569020 | Pre-mRNA splicing factor | Cryptococcus neofo
114 | 1114 | 8 | 51 | 5.12 | NP_952665 | CoA-binding protein | Geobacter sulfurreo
115 | 7580 | 3 | 35 | 9.36 | XP_570304 | Prefoldin subunit | Cryptococcus neofo
116 | 12381 | 2 | 32 | 12.84 | NP_923845 | Metalloprotease MEP1 homolog | Gloeobacter violace
117 | 11656 | 25 | 91 | 2.92 | ABB96268 | Hesp-178 | Melampsora lini
118 | 7964 | 17 | 72 | 3.40 | XP_001385113 | NADH-ubiquinone oxidoreductase | Pichia stipitis CBS
119 | 8093 | 16 | 70 | 3.51 | NP_497891 | RNP (RRM RNA binding domain) containing family (rnp-4) | Caenorhabditis eleg
120 | 2537 | 32 | 105 | 2.63 | XP_567099 | Ribosomal protein L35 | Cryptococcus neofo
121 | 1979 | 47 | 135 | 2.30 | XP_001270561 | LYR family protein | Aspergillus clavatus
122 | 6390 | 36 | 113 | 2.52 | XP_566503 | Prohibitin PHB1 | Cryptococcus neofo
123 | 6796 | 3 | 34 | 9.09 | XP_001275445 | Ketoreductase, putative | Aspergillus clavatus

124 | 5252 | 24 | 87 | 2.91 | XP_567555 | Transcriptional regulatory protein | Cryptococcus neofi
125 | 2575 | 88 | 210 | 1.91 | XP_571207 | Structural constituent of ribosome | Cryptococcus neofi
126 | 6742 | 12 | 59 | 3.94 | ABE57263 | Septin ring protein | Cryptococcus neofi
127 | 7513 | 52 | 142 | 2.19 | BAA33368 | Ribosomal protein S16 homolog | Schizosaccharomyct
128 | 1416 | 50 | 138 | 2.21 | AAP36848 | Homo sapiens tubulin-specific chaperone a | Homo sapiens
129 | 7617 | 5 | 39 | 6.26 | AAS46754 | Mitochondrial ribosome small subunit component RPS19 | Pleurotus djamor
130 | 2440 | 3 | 33 | 8.83 | XP_368728 | Deoxyuridine 5'-triphosphate nucleotidohydrolase | Magnaporthe grise
131 | 8203 | 20 | 77 | 3.09 | XP_001271678 | 60S ribosomal protein L38, putative | Aspergillus clavatu
132 | 8322 | 46 | 130 | 2.27 | EAU89607 | 60S ribosomal protein L27a | Coprinopsis cinere
134 | 435 | 37 | 112 | 2.43 | AAZ14910 | Protein transport protein SEC61 | Coprinellus dissem
136 | 8584 | 364 | 657 | 1.45 | XP_569612 | 40S ribosomal protein S12 | Cryptococcus neofc
137 | 6350 | 37 | 111 | 2.41 | XP_001271227 | 60S ribosomal protein L22, putative | Aspergillus clavatu
138 | 11442 | 5 | 38 | 6.10 | XP_566529 | RVB1 | Cryptococcus neofc
139 | 10794 | 82 | 195 | 1.91 | O93931 | 40S ribosomal protein S26 | Schizophyllum com
140 | 3431 | 52 | 139 | 2.14 | XP_663583 | 40S ribosomal protein S17 | Aspergillus nidulan
141 | 6295 | 14 | 61 | 3.50 | XP_974790 | PREDICTED: similar to CG7220-PA, isoform A | Tribolium castaneu
143 | 624 | 38 | 112 | 2.36 | XP_567077 | Chaperone | Cryptococcus neofc
144 | 7936 | 24 | 83 | 2.77 | AAH86909 | Ribosomal protein L32 | Mus musculus
145 | 10885 | 187 | 372 | 1.60 | EAU91281 | 40S ribosomal protein S18 | Coprinopsis cinerei
146 | 2454 | 11 | 54 | 3.94 | BAC77593 | Alpha2 tubulin | Coprinopsis cinerei
147 | 3693 | 26 | 87 | 2.68 | EAU89413 | Histone H2B.1 | Coprinopsis cinerei
148 | 366 | 26 | 87 | 2.68 | XP_569252 | Structural constituent of ribosome | Cryptococcus neofc
149 | 2559 | 1 | 25 | 20.06 | XP_566837 | Isocitrate dehydrogenase | Cryptococcus neofc
150 | 2336 | 5 | 37 | 5.94 | XP_001258892 | CBF/NF-Y family transcription factor, putative | Neosartorya fische
152 | 7250 | 9 | 48 | 4.28 | XP_001266797 | cell differentiation protein | Neosartorya fischei
154 | 5395 | 1 | 23 | 18.45 | NP_741663 | Y39B6A.34 | Caenorhabditis elei
155 | 3292 | 24 | 81 | 2.71 | XP_569562 | Structural constituent of ribosome | Cryptococcus neofc

156 | 391 | 5 | 36 | 5.78 | XP_566752 | Methionine adenosyltransferase | Cryptococcus neof
157 | 6045 | 11 | 51 | 3.72 | XP_001247741 | NADH-ubiquinone oxidoreductase 51 kDa subunit | Coccidioides immi
159 | 3724 | 0 | 18 | 28.89 | XP_572221 | ATP dependent RNA helicase | Cryptococcus neof
160 | 4320 | 0 | 19 | 30.49 | XP_001210577 | Ubiquitin-protein ligase | Cryptococcus neof
161 | 7769 | 24 | 80 | 2.67 | XP_568281 | Sterol metabolism-related protein | Cryptococcus neof
162 | 10509 | 55 | 140 | 2.04 | XP_001384926 | ATP synthase F0 sector subunit 4 | Pichia stipitis CBS
163 | 6892 | 18 | 67 | 2.99 | NP_001025949 | Mak3 homolog | Gallus gallus
165 | 12124 | 83 | 190 | 1.84 | XP_567416 | 60s ribosomal protein l11 | Cryptococcus neof
166 | 1072 | 2 | 26 | 10.43 | AAO73005 | Voltage-gated chloride channel | Cryptococcus neof
167 | 190 | 4 | 32 | 6.42 | ZP_01418021 | Thioredoxin | Caulobacter sp. K
168 | 4197 | 1 | 22 | 17.65 | NP_984808 | AEL053Cp | Ashbya gossypii A
169 | 5877 | 1 | 22 | 17.65 | EDL90861 | rCG38713, isoform CRA_c | Rattus norvegicus
171 | 1494 | 25 | 81 | 2.60 | XP_758500 | 40S ribosomal protein S9 (S7) | Ustilago maydis 51
172 | 2184 | 17 | 63 | 2.97 | XP_572016 | Rho GDP-dissociation inhibitor 1 | Cryptococcus neofi
174 | 9323 | 0 | 17 | 27.28 | XP_752223 | N-terminal acetyltransferase 2 | Aspergillus fumiga
175 | 7344 | 0 | 17 | 27.28 | NP_001062766 | Os09g0280600 | Oryza sativa (japoi
176 | 9523 | 133 | 273 | 1.65 | XP_568154 | Gal4 DNA-binding enhancer protein 2 | Cryptococcus neofi
177 | 436 | 31 | 92 | 2.38 | CAL52492 | Conserved alpha-helical protein (ISS) | Ostreococcus tauri
178 | 8094 | 2 | 25 | 10.03 | AAW24985 | SJCHGC06573 protein | Schistosoma japom
179 | 602 | 233 | 434 | 1.49 | NP_740781 | Ribosomal Protein, Large subunit family member (rpl-17) | Caenorhabditis ele
180 | 6392 | 38 | 105 | 2.22 | AAP13582 | Ras-related protein Rab7 | Lentinula edodes
182 | 6336 | 8 | 41 | 4.11 | XP_001600399 | Similar to Polymerase (RNA) III (DNA directed) polypeptide E | Nasonia vitripennv
183 | 8752 | 8 | 41 | 4.11 | XP_566860 | Vacuolar ATP synthase subunit e | Cryptococcus neofc
185 | 2050 | 3 | 28 | 7.49 | XP_572901 | Peptidase | Cryptococcus neofc
186 | 8867 | 11 | 48 | 3.50 | XP_624938 | Similar to small nuclear ribonucleoprotein at 69D CG10753-PA | Apis mellifera
187 | 10689 | 0 | 16 | 25.68 | XP_001272379 | Acetamidase, putative | Aspergillus clavatu
188 | 2980 | 0 | 16 | 25.68 | XP_761532 | GBA4_USTMA Guanine nucleotide-binding protein alpha-4 subunit | Ustilago maydis 52

189 | 9514 | 2 | 24 | 9.63 | AAZ20286 | Ubiquitin-conjugating enzyme 1 | Arachis hypogaea
190 | 6014 | 29 | 86 | 2.38 | EAU80990 | Rho1 protein | Coprinopsis cinere
191 | 4840 | 6 | 35 | 4.68 | XP_967031 | PREDICTED: similar to transcription factor INI | Tribolium castaneu
192 | 3299 | 7 | 38 | 4.36 | XP_569684 | Proteasome subunit, beta type, 7 | Cryptococcus neofc
193 | 1750 | 163 | 318 | 1.57 | XP_572717 | 60S ribosomal protein L19 | Cryptococcus neofc
194 | 9530 | 3 | 26 | 6.95 | ABR88135 | Trehalose phosphorylase | Pleurotus pulmona
195 | 2280 | 18 | 62 | 2.76 | BAD11816 | Putative S-phase specific ribosomal protein cyc07 | Lentinula edodes
196 | 7074 | 179 | 342 | 1.53 | XP_569366 | Ribosomal protein large subunit L37 | Cryptococcus neofc
198 | 7915 | 8 | 39 | 3.91 | AAG12157 | Rho3 GTPase | Aspergillus fumigai
199 | 1108 | 0 | 15 | 24.07 | CAA36735 | DNA-directed RNA polymerase | Arabidopsis thaliar
200 | 5357 | 0 | 15 | 24.07 | XP_001262660 | Pyridine nucleotide-disulphide oxidoreductase, putative | Neosartorya fische
202 | 1521 | 5 | 32 | 5.14 | XP_572446 | NADH-ubiquinone oxidoreductase | Cryptococcus neofc
203 | 7357 | 7 | 37 | 4.24 | XP_001215819 | Nicotinamide-nucleotide adenylyltransferase 1 | Aspergillus terreus
204 | 9313 | 16 | 57 | 2.86 | BAA33018 | Pcc1 | Coprinopsis cinerec
206 | 10375 | 1 | 19 | 15.25 | XP_001274128 | DUF1479 domain protein | Aspergillus clavatu
207 | 7466 | 1 | 19 | 15.25 | XP_756068 | Serine peptidase, family S28, putative | Aspergillus fumigat
208 | 11340 | 34 | 93 | 2.19 | XP_569415 | Ribosomal protein | Cryptococcus neofc
209 | 3402 | 3 | 25 | 6.69 | XP_571543 | Glutathione S-transferase 6 | Cryptococcus neofc
210 | 9613 | 8 | 38 | 3.81 | XP_567980 | 20 kda nuclear cap binding protein (ncbp) | Cryptococcus neofc
211 | 5551 | 71 | 159 | 1.80 | XP_571764 | Ribosomal protein L10e | Cryptococcus neofc
212 | 1812 | 5 | 30 | 4.81 | NP_001058751 | Os07g0114300 | Oryza sativa (japon
213 | 3312 | 5 | 30 | 4.81 | XP_567519 | Signal recognition particle protein | Cryptococcus neofo
214 | 12282 | 12 | 47 | 3.14 | XP_569804 | Voltage-dependent ion-selective channel | Cryptococcus neofo
215 | 11221 | 5 | 31 | 4.97 | NP_593972 | HypotDNA-directed RNA polymerase II subunit | Schizosaccharomyc
216 | 996 | 22 | 68 | 2.48 | XP_570703 | NADH-ubiquinone oxidoreductase 12 kda subunit | Cryptococcus neofo
217 | 1049 | 2 | 22 | 8.83 | EAL91670 | Bax Inhibitor family protein | Aspergillus fumigat
218 | 10352 | 0 | 14 | 22.47 | XP_001258818 | WSC domain protein | Neosartorya fischer

221 | 11583 | 24 | 72 | 2.41 | XP_566724 | Arp2/3 complex 16 kda subunit (p16-arc) | Cryptococcus neofi
222 | 1106 | 14 | 51 | 2.92 | XP_755891 | V-ATPase proteolipid subunit (Ppa1), putative | Aspergillus fumiga
223 | 3717 | 19 | 61 | 2.58 | Q9HGX4 | Histone H2A | Agaricus bisporus
224 | 467 | 31 | 85 | 2.20 | XP_001267301 | 37S ribosomal protein Rsm25 | Neosartorya fische
226 | 5296 | 3 | 24 | 6.42 | XP_001271346 | Cyclin-dependent protein kinase, putative | Aspergillus clavatu
227 | 6652 | 3 | 24 | 6.42 | XP_001276209 | Stress response RCI peptide, putative | Aspergillus clavatu
228 | 7928 | 3 | 24 | 6.42 | XP_572020 | Translation elongation factor | Cryptococcus neofc
229 | 9536 | 3 | 24 | 6.42 | XP_569767 | Translation release factor | Cryptococcus neofc
230 | 1059 | 8 | 37 | 3.71 | AAT91252 | Het-c2 protein | Paxillus involutus
231 | 5367 | 4 | 27 | 5.42 | XP_783367 | Similar to DnaJ (Hsp40) homolog, subfamily C, member 9 | Strongylocentrotus
232 | 1287 | 15 | 52 | 2.78 | XP_567521 | Ribosomal protein | Cryptococcus neofc
233 | 2031 | 101 | 207 | 1.64 | XP_572711 | 40S ribosomal protein S7 | Cryptococcus neofc
234 | 5829 | 6 | 32 | 4.28 | BAD97445 | Exo-beta-1,3-glucanase | Lentinula edodes
238 | 4872 | 11 | 43 | 3.14 | XP_568084 | Alanine-glyoxylate transaminase | Cryptococcus neofi
239 | 2397 | 10 | 41 | 3.29 | XP_566482 | COPII-coated vesicle protein | Cryptococcus neofi
240 | 12047 | 16 | 54 | 2.71 | EAU90492 | Ubiquitin | Coprinopsis cinere
241 | 4605 | 2 | 20 | 8.02 | XP_572378 | Ribosomal protein L4 | Cryptococcus neofc
242 | 2111 | 9 | 38 | 3.39 | P49741 | Tubulin alpha-1A chain | Schizophyllum com
243 | 5663 | 0 | 13 | 20.86 | NP_495267 | C29F5.1 | Caenorhabditis ele
244 | 2731 | 0 | 13 | 20.86 | NP_924609 | Metalloprotease MEP1 homolog | Gloeobacter violac
247 | 5012 | 35 | 91 | 2.09 | XP_567107 | 60s ribosomal protein l27 | Cryptococcus neofc
249 | 11770 | 15 | 51 | 2.73 | AAT91258 | GTPase | Paxillus involutus
250 | 649 | 1 | 17 | 13.64 | NP_001049204 | Os03g0186800 | Oryza sativa (japor
251 | 8676 | 7 | 33 | 3.78 | XP_569674 | Elongation factor 1-beta (ef-1-beta) | Cryptococcus neofc
253 | 9513 | 6 | 31 | 4.15 | XP_566632 | Transcriptional elongation regulator | Cryptococcus neofc
254 | 1791 | 180 | 330 | 1.47 | AAY85811 | 40S ribosomal protein S27 | Chaetomium globoi
255 | 8724 | 9 | 37 | 3.30 | XP_568963 | Nam9 protein, mitochondrial precursor | Cryptococcus neofc

256 | 9300 | 56 | 127 | 1.82 | NP_033102 | Ribosomal protein L12 | Mus musculus
257 | 2625 | 2 | 19 | 7.62 | XP_750614 | Fe superoxide dismutase, putative | Aspergillus fumiga
258 | 612 | 12 | 44 | 2.94 | XP_566440 | Proteolysis and peptidolysis-related protein | Cryptococcus neof
260 | 4977 | 11 | 41 | 2.99 | EAU89692 | 40S ribosomal protein S13 | Coprinopsis cinere
261 | 8643 | 3 | 22 | 5.88 | AAX51845 | Rab-type small GTP-binding protein | Paxillus involutus
264 | 5601 | 0 | 12 | 19.26 | XP_001256245 | PREDICTED: similar to mucin 5 | Bos taurus
265 | 8107 | 0 | 12 | 19.26 | YP_705633 | Probable dimethylaniline monooxygenase (N-oxide-forming) | Rhodococcus sp. R
266 | 4252 | 0 | 12 | 19.26 | XP_567924 | Protein N-terminal asparagine amidohydrolase | Cryptococcus neof
269 | 274 | 5 | 27 | 4.33 | YP_001412549 | tRNA modification GTPase TrmE | Parvibaculum lava
270 | 2006 | 4 | 24 | 4.81 | XP_569537 | Acetyl/propionyl CoA carboxylase, beta subunit | Cryptococcus neoft
271 | 2063 | 4 | 25 | 5.01 | AAT69640 | Putative proteasome subunit alpha type 3 | Oryza sativa (japo
272 | 2113 | 20 | 59 | 2.37 | XP_001134590 | LSM (like-Sm) domain-containing protein | Dictyostelium disci
273 | 3125 | 1 | 16 | 12.84 | YP_484201 | Beta-lactamase | Rhodopseudomona
274 | 10076 | 1 | 16 | 12.84 | XP_569821 | Phosphoribosyl-ATP diphosphatase | Cryptococcus neofi
275 | 1190 | 1 | 16 | 12.84 | XP_570894 | Small nuclear ribonucleoprotein | Cryptococcus neofi
276 | 11205 | 1 | 16 | 12.84 | XP_572679 | Splicing factor | Cryptococcus neofi
277 | 3385 | 71 | 151 | 1.71 | EAU81446 | 60S ribosomal protein L44 | Coprinopsis cinere
278 | 5488 | 15 | 49 | 2.62 | XP_572808 | Trafficking-related protein | Cryptococcus neoft
279 | 11995 | 25 | 69 | 2.21 | XP_001274738 | NADH-ubiquinone oxidoreductase 21 kDa subunit, putative | Aspergillus clavatu
280 | 6910 | 14 | 47 | 2.69 | AAZ30051 | Carbonic anhydrase 2 | Cryptococcus neofi
281 | 219 | 11 | 40 | 2.92 | XP_001140748 | PREDICTED: similar to Lysophospholipase isoform 2 | Pan troglodytes
282 | 9699 | 2 | 18 | 7.22 | AAL73237 | ADP-ribosylation factor-like protein | Coprinopsis cinere
283 | 7761 | 2 | 18 | 7.22 | XP_569484 | ER to Golgi transport-related protein | Cryptococcus neofi
284 | 1887 | 2 | 18 | 7.22 | EDN61482 | Thioredoxin peroxidase | Saccharomyces cere
287 | 65 | 68 | 144 | 1.70 | XP_001217810 | Alcohol oxidase | Aspergillus terreus
290 | 370 | 3 | 21 | 5.62 | AAI23911 | LOC548742 protein | Xenopus tropicalis
292 | 2532 | 6 | 29 | 3.88 | XP_571022 | NTL02AT1906 50S ribosomal protein L22 | Cryptococcus neofi

293 | 11789 | 10 | 37 | 2.97 | XP_001209505 | Adenylate kinase cytosolic | Aspergillus terreus
294 | 5225 | 10 | 38 | 3.05 | XP_569770 | Thioredoxin | Cryptococcus neofi
295 | 5448 | 5 | 26 | 4.17 | XP_568798 | Short-chain dehydrogenase | Cryptococcus neofi
296 | 12254 | 4 | 23 | 4.61 | NP_982749 | ABL198Cp | Ashbya gossypii A
297 | 6831 | 4 | 23 | 4.61 | BAE81787 | Alginate lyase | Haliotis discus han
298 | 3238 | 4 | 23 | 4.61 | XP_001260590 | Pol II transcription elongation factor subunit Cdc73, putative | Neosartorya fische
299 | 12382 | 12 | 42 | 2.81 | XP_001070778 | Similar to U6 snRNA-associated Sm-like protein LSm6 isoform 1 | Rattus norvegicus
300 | 7316 | 0 | 11 | 17.65 | NP_001080226 | Apoptosis inhibitor 5 | Xenopus laevis
301 | 5477 | 0 | 11 | 17.65 | CAB88663 | Argininosuccinate lyase | Agaricus bisporus
302 | 9933 | 0 | 11 | 17.65 | ZP_01545767 | Expressed protein | Stappia aggregata
303 | 27 | 0 | 11 | 17.65 | XP_572680 | Nuclear mRNA splicing protein | Cryptococcus neofi
304 | 2265 | 0 | 11 | 17.65 | XP_001276085 | Rheb small monomeric GTPase RhbA | Aspergillus clavatu
305 | 7891 | 0 | 11 | 17.65 | XP_567179 | Ribosomal large subunit assembly and maintenance-related protein | Cryptococcus neofi
306 | 10399 | 0 | 11 | 17.65 | Q5BGN7 | Structural constituent of ribosome | Cryptococcus neofc
307 | 519 | 0 | 11 | 17.65 | XP_568307 | Tricarboxylate carrier | Cryptococcus neofi
311 | 9165 | 1 | 15 | 12.04 | XP_569702 | Aminomethyltransferase, mitochondrial precursor | Cryptococcus neofi
312 | 2112 | 1 | 15 | 12.04 | XP_001267288 | DNA methyltransferase 1-associated protein DMAP1 | Neosartorya fischei
313 | 11521 | 1 | 15 | 12.04 | XP_001358649 | GA21173-PA | Drosophila pseudo
314 | 1348 | 1 | 15 | 12.04 | NP_010864 | Mitochondrial ribosomal protein L2 | Saccharomyces cer
315 | 8732 | 1 | 15 | 12.04 | XP_573025 | SNAP receptor | Cryptococcus neofc
316 | 8082 | 1 | 15 | 12.04 | CAM21006 | WD repeat domain 5 | Mus musculus
318 | 2704 | 8 | 32 | 3.21 | NP_999574 | Succinate-CoA ligase, GDP-forming, alpha subunit | Sus scrofa
320 | 7510 | 35 | 85 | 1.95 | XP_568309 | Ribosomal protein L13 | Cryptococcus neofc
321 | 6619 | 7 | 30 | 3.44 | NP_001012622 | Macrophage erythroblast attacher | Gallus gallus
322 | 7060 | 13 | 43 | 2.65 | XP_001247866 | Ubiquitin-conjugating enzyme | Coccidioides immit
323 | 5430 | 36 | 86 | 1.92 | XP_001276081 | Zinc finger protein, putative | Aspergillus clavatu
324 |  | 10 | 36 | 2.89 | ZP_01461315 | Alkyl hydroperoxide reductase/ Thiol specific antioxidant | Stigmatella aurantu


325 | 529 | 2 | 17 | 6.82 | XP_570711 | Cytoplasm protein | Cryptococcus neoj
326 | 9466 | 2 | 17 | 6.82 | AAB53686 | Temperature sensitive supressor of bem1/bud5 | Saccharomyces ce
327 | 6630 | 26 | 68 | 2.10 | NP_182291 | NADH-ubiquinone oxidoreductase-related | Arabidopsis thalian
328 | 1715 | 5 | 25 | 4.01 | XP_567043 | Succinate-CoA ligase (ADP-forming) | Cryptococcus neoj
329 | 1421 | 24 | 64 | 2.14 | ZP_01654112 | Ribosomal protein L1 | Thermosipho mela
330 | 12014 | 12 | 40 | 2.67 | XP_569077 | DNA-directed RNA polymerases i, ii, and iii 8.3 kda polypeptide | Cryptococcus neof
331 | 10363 | 3 | 20 | 5.35 | XP_569643 | Dephospho-CoA kinase | Cryptococcus neof
332 | 1805 | 3 | 20 | 5.35 | P0C1J7 | FK506-binding protein 5 (Peptidyl-prolyl cis-trans isomerase) | Rhizopus oryzae
333 | 6159 | 9 | 34 | 3.03 | XP_001267224 | CobW domain protein | Neosartorya fische
334 | 8409 | 4 | 22 | 4.41 | XP_001240328 | DNA-directed RNA polymerase II | Coccidioides immii
335 | 1458 | 4 | 22 | 4.41 | XP_001272089 | Prefoldin subunit 1, putative | Aspergillus clavatu
337 | 750 | 57 | 122 | 1.72 | XP_569162 | Ribosomal L10 protein | Cryptococcus neofi
338 | 3441 | 30 | 75 | 2.01 | CAA80880 | Ribosomal protein A1 | Schizosaccharomyc
339 | 10047 | 11 | 38 | 2.77 | CAD43407 |  | Glomerella lindemi
342 | 557 | 1 | 13 | 10.43 | XP_572666 | DNA-directed RNA polymerases I, II, and III polypeptide | Cryptococcus neofc
343 | 12145 | 1 | 13 | 10.43 | XP_001270096 | Phosphatidyl synthase | Aspergillus clavatu
344 | 10080 | 1 | 13 | 10.43 | XP_001272057 | Phosphatidylinositol phospholipase C | Aspergillus clavatu
345 | 1709 | 1 | 13 | 10.43 | XP_570193 | Protein transporter | Cryptococcus neofc
346 | 2539 | 1 | 13 | 10.43 | XP_717481 | Putative H/ACA snoRNP component | Candida albicans S
347 | 8342 | 1 | 13 | 10.43 | XP_001259289 | RNA annealing protein Yra1, putative | Neosartorya fische
348 | 368 | 10 | 35 | 2.81 | XP_001262863 | BET3 family protein | Neosartorya fischei
349 | 272 | 29 | 72 | 1.99 | XP_568917 | SAR small monomeric GTPase | Cryptococcus neofc
350 | 6554 | 1 | 14 | 11.23 | XP_754163 | Extracellular dioxygenase, putative | Aspergillus fumigai
351 | 7021 | 22 | 59 | 2.15 | XP_722087 | Putative mitochondrial ribosomal protein S7 | Candida albicans S
352 | 10075 | 12 | 39 | 2.61 | AAZ22511 | Pho23p | Saccharomyces cer
353 8650 6 26 3.48 NP—984745 AEL116Cp Ashbya gossypii AT _5248 6 26 3.48 XP—001271438 Oxidoreductase Aspergillus clavatu.

112 Table 3.4 (continued) 355 10490 6 26 3.48 XP_001086617 PREDICTED: similar to SEC 11-like 3 isoform 2 Macaca mulatta 357 4939 0 10 16.05 NP_986552 AGL115Wp Ashbya gossypii A 358 3763 0 10 16.05 BAD11817 Cytochrome P450 Lentinula edodes 359 11664 0 10 16.05 CAB85700 Cytochrome P450 Agaricus bisporus 360 12344 0 10 16.05 BAB59027 Cytochrome P450 Trametes versicolc 361 3951 0 10 16.05 XP_569532 Cytokinesis-related protein Cryptococcus neoj 362 2245 0 10 16.05 XP_001271871 DUFl 682 domain protein Aspergillus clavati 363 9341 0 10 16.05 AAF05616 dynactin subunit p27 Mus musculus 364 5931 0 10 16.05 XP_001267915 Dynamin family protein Aspergillus clavati 365 2300 0 10 16.05 XP_568739 Glucan 1,3 beta-glucosidase protein Cryptococcus neoj 366 3026 0 10 16.05 XP_755596 Integral membrane protein Aspergillus fumiga 367 6487 0 10 16.05 XP_001270398 MFS transporter, putative Aspergillus clavati 368 8556 0 10 16.05 XP_001262810 Phosphatidylserine decarboxylase family protein Neosartorya fische 369 7652 0 10 16.05 AAP13582 Ras-related protein Rab7 Lentinula edodes 370 1062 0 10 16.05 XP—567975 SET domain protein Cryptococcus neof 371 2800 0 10 16.05 XP_001273337 Short chain dehydrogenase/reductase family oxidoreductase Aspergillus clavati 372 5857 0 10 16.05 XP_571302 UDP-N-acetylglucosamine diphosphorylase Cryptococcus neof 382 4847 30 73 1.95 NP_ 193777 SMB Arabidopsis thalim 383 8463 14 43 2.46 XP_001209532 60S ribosomal protein LI8 Aspergillus terreus 384 2217 16 47 2.36 XP—571814 Structural constituent of ribosome Cryptococcus neofi 385 2841 2 16 6.42 NP一983099 ABRlSlWp Ashbya gossypii Al 386 5129 2 16 6.42 XP_567471 Transport-related protein Cryptococcus neofi 389 12198 5 24 3.85 XP_966894 Similar to polymerase (RNA) II (DNA directed) polypeptide D Tribolium castaneu 390 2463 4 21 4.21 NP_832246 Enoyl-CoA hydratase Bacillus cere us AT 391 9526 4 21 4.21 XP—001137138 Similar to Malignant T cell amplified sequence 1 isoform 1 Pan troglodytes 392 5595 4 21 4.21 XP_566778 
Vacuolar ATP synthase subunit d Cryptococcus neofi 393 7952 11 37 2.70 XP—566591 Actin ii (centractin-like protein) Cryptococcus neofc

113 Table 3.7 (Continued) 394 2611 3 19 5.08 XP_001258009 Assimilatory sulfite reductase Neosartorya fischt 395 2140 3 19 5.08 YP_001093614 FAD linked oxidase domain protein Shewanella loihia 398 11122 19 52 2.20 XP—568012 Membrane fraction protein Cryptococcus neoj 399 3710 13 41 2.53 AAK82369 Manganese superoxide dismutase Phanerochaete ck 401 9988 15 45 2.41 Q9AT63 Probable pyridoxal biosynthesis protein PDXl (Sor-like protein) Ginkgo biloba 402 6179 24 62 2.07 XP_571345 Ketol-acid reductoisomerase Cryptococcus neoj 404 2422 7 28 3.21 XP_572449 50s ribosomal protein 119 Cryptococcus neoj 405 4971 16 46 2.31 XP_570189 Aldo-keto reductase Cryptococcus neoj 406 2327 38 85 1.79 AAR91505 Ribosomal protein L29 Tetraodon fluviatu 407 7547 1 12 9.63 XP_001208895 cAMP-independent regulatory protein pac2 Aspergillus terreui 408 515 1 12 9.63 XP_569298 Cytosolic large ribosomal subunit protein Cryptococcus neoj 409 1207 1 12 9.63 XP_568633 Hydrolase Cryptococcus neoj 410 2431 1 12 9.63 XP_567632 Regulation of meiosis-related protein Cryptococcus neoj 411 10063 1 12 9.63 XP—814003 Ubiquitin-conjugating enzyme E2 Trypanosoma cruz 412 2243 1 12 9.63 CAA63316 Ubiquitin-protein ligase; ubiquitin-conjugating-protein Agaricus bisporus 416 2236 11 35 2.55 XP_568453 Cytoplasm protein Cryptococcus neoj 417 990 13 39 2.41 NP_013167 Spc3p Saccharomyces ce. 
418 5594 5 23 3.69 XP_568655 Endoplasmic reticulum protein Cryptococcus neoj 419 6298 5 23 3.69 XP_001213178 Iron sulfur assembly protein 1 Aspergillus terrem 420 4204 8 29 2.91 BAF45335 Fatty acid desaturase Coprinopsis cinere 421 1386 8 29 2.91 CAA58778 Salicylate 1-monooxygenase Pseudomonas putu 422 12322 4 20 4.01 BAB59027 Cytochrome P450 Coriolus versicolo 423 4625 4 20 4.01 XP—567791 Glutamyl-tRNA synthetase Cryptococcus neoj 424 7861 4 20 4.01 XP_568458 Long-chain fatty acid transporter Cryptococcus neoj 426 210 2 15 6.02 XP_001212090 Casein kinase I isoform gamma-1 Aspergillus terrem 427 7955 2 15 6.02 XP_001375079 PREDICTED: similar to WD repeat domain 85 Monodelphis dome 428 304 2 15 6.02 XP—571849 snRNP subunit Cryptococcus neof

114 Table 3.7 (Continued) 429 11245 2 15 6.02 BAC78619 Subtilisin-like serine protease Coprinopsis cinen 430 457 2 15 6.02 XP_001351949 Subunit of proteaseome activator complex, putative Plasmodium falcip 432 10784 28 67 1.92 ABB96277 Hesp-767 Melampsora lini 433 2614 3 17 4.55 BAE47513 Putative cyclic AMP-dependent protein kinase regulatory subunit Schizophyllum cor 436 2864 10 33 2.65 XP_566757 Heat shock protein Cryptococcus neoj 437 7943 10 33 2.65 XP_977935 Similar to ubiquitin-like, containing PHD and RING finger domains, Mus musculus 438 8577 0 9 14.44 XP_001268092 Cellular morphogenesis protein (Bud22), putative Aspergillus clavati 439 5378 0 9 14.44 XP_569303 Dolichyl-diphosphooligosaccharide-protein glycotransferase Cryptococcus neoj 440 4075 0 9 14.44 XP_001259907 MFS transporter, putative Neosartorya fischi 441 10328 0 9 14.44 XP—001262810 Phosphatidylserine decarboxylase family protein Neosartorya fische 442 5390 0 9 14.44 XP_001274057 Poly(ADP)-ribose polymerase PARP, putative Aspergillus clavati 443 923 0 9 14.44 XP—970724 Similar to DnaJ homolog subfamily A member 2 Tribolium castanei 444 12464 0 9 14.44 YP_001058595 Putative lipoprotein Burkholderia pseu 445 9799 0 9 14.44 XP_001259014 Short-chain oxidoreductase, putative Neosartorya fische 450 11358 3 18 4.81 YP—323499 Peptidase S33, raline iminopeptidase 1 Anabaena variabil 451 3439 3 18 4.81 XP_001275915 Thioesterase family protein Aspergillus clavati 453 9312 98 183 1.50 XP—571982 Glycine-rich RNA binding protein Cryptococcus neoj 454 6803 14 41 2.35 XP_570422 Carbamoyl-phosphate synthase (glutamine-hydrolyzing) Cryptococcus neoj 455 307 53 110 1.67 BAA95482 Glia maturation factor beta Cyprinus carpio 456 9622 9 31 2.76 XP_001092786 ATP synthase mitochondrial F1 complex assembly factor 2 isoform Macaca mulatta 457 10049 9 31 2.76 XP—572890 Transcription initiation protein spt4 Cryptococcus neoj 458 10886 6 24 3.21 EAU91281 40S ribosomal protein S18 Coprinopsis cinen 459 4894 6 24 3.21 
XP—001352627 GA13435-PA Drosophila pseudc 460 6783 6 24 3.21 NP_659142 Solute carrier family 35, member C2 Mus musculus 461 2739 15 42 2.25 XP—572797 Clathrin light chain Cryptococcus neoj 462 619 11 34 2.48 XP_568541 Malate dehydrogenase (oxaloacetate-decarboxylating) Cryptococcus neof ^__ 13 38 2.35 XP—757129 Ubiquitin-conjugating enzyme E2-16 kDa Ustilago maydis 5:

115 Table 3.4 (continued) 464 8623 5 21 3.37 XP_567003 Pre-mRNA splicing factor RNA helicase PRP28 Cryptococcus neoj 465 5020 8 28 2.81 CAL51577 U4/U6-associated splicing factor PRP4 (ISS) [ Ostreococcus taur 466 8212 29 67 1.85 EAU86802 60S ribosomal protein L23 Coprinopsis cinere 467 5402 5 22 3.53 YP_033835 508 ribosomal protein L3 Bartonella hensela 468 1992 1 11 8.83 XP_753590 1,4-alpha-glucan branching enzyme Aspergillus fumiga 469 4772 1 11 8.83 XP_001258675 Aminopeptidase Y, putative Neosartorya fische 470 4925 1 11 8.83 NP_034494 General transcription factor IIH, polypeptide 4 Mus musculus 471 10285 1 11 8.83 XP_001260881 Histone-like transcription factor (CBF/NF-Y), putative Neosartorya fische 472 6083 1 11 8.83 XP—567686 Mitochondrion protein Cryptococcus neoj 473 5567 1 11 8.83 XP_001269354 Nuclear pore complex protein (SonA), putative Aspergillus clavati 474 11303 1 11 8.83 CAE05499 OSJNBa0022H21.19 Oryza sativa (japo‘ 475 1959 1 11 8.83 XP—001261319 PCI domain protein Neosartorya fische 476 563 1 11 8.83 AAZ14931 Putative nucleoside-diphosphate-sugar epimerase Coprinellus dissem 477 5421 1 11 8.83 NP_588227 Replication factor A complex protein 2 Schizosaccharomyu 478 12503 1 11 8.83 XP—451301 UCRQ—KLULA Kluyveromyces lac 481 206 10 32 2.57 XP_569347 Vacuolar ATP synthase subunit d Cryptococcus neofi 483 1828 24 58 1.94 XP_001268042 Ribosomal protein L33, putative Aspergillus clavatu 484 1774 7 25 2.87 AAP43506 FK506-binding protein FKBP12 Schizophyllum com 485 2092 7 25 2.87 XP—001265273 Prefoldin subunit 2, putative Neosartorya fische 487 3829 3 16 4.28 AAN74825 MPUlp Gibberella monilifc 488 1529 3 16 4.28 AAF63681 orf5 protein Mus musculus 489 9827 3 16 4.28 XP_001269671 Translocon protein Sec61beta, putative Aspergillus clavatu 490 10244 3 16 4.28 P79008 Tubulin beta chain (Beta tubulin) Coprinopsis cinere, 493 2345 2 14 5.62 XP—752352 37S ribosomal protein SI 1, putative Aspergillus fumiga� 494 5655 2 14 5.62 XP_570283 ATP 
phosphoribosyltransferase Cryptococcus neofc 495 9557 2 14 5.62 CAJ42002 Glycogen synthase kinase Ustilago hordei 496 11297 2 14 5.62 XP—571569 Plasma membrane fusion-related protein Cryptococcus neofi

116 Table 3.4 (continued) 497 11863 2 14 5.62 XP_571972 Ribose-phosphate diphosphokinase Cryptococcus neoj 499 11376 9 29 2.59 XP_001217370 Ubiquinone biosynthesis protein C0Q4, mitochondrial precursor Aspergillus terreui 500 5052 6 23 3.08 XP_566973 Fatty acid desaturase Cryptococcus neoj 501 7854 18 46 2.05 XP一753739 NADH-ubiquinone oxidoreductase 9.5 kDa subunit, putative Aspergillus fumigo 502 4203 0 7 11.23 NP_563695 ADKl (DUAL SPECIFICITY KINASE 1); kinase Arabidopsis thalia 503 1788 0 7 11.23 XP_001264987 CBS and PBl domain protein Neosartorya fische 504 11094 0 7 11.23 XP_572594 Cell division control protein Cryptococcus neoj 505 10033 0 7 11.23 XP_571066 Cyclin-dependent protein kinase regulator Cryptococcus neoj 506 8307 0 7 11.23 XP_569433 Cystathionine gamma-synthase Cryptococcus neoj 507 12335 0 7 11.23 BAF37047 DNA polymerase lambda Coprinopsis cinere 508 5742 0 7 11.23 AAA35313 Heat shock transcription factor Pichia stipitis 509 5494 0 7 11.23 AA025531 Kex2 Cryptococcus neof 510 5360 0 7 11.23 XP_572187 Mitotic spindle assembly -related protein Cryptococcus neof 511 7320 0 7 11.23 XP_752233 NADH dehydrogenase (ubiquinone) 29/2 IK chain precursor Aspergillus fumiga 512 7476 0 7 11.23 XP_001268283 NADH-ubiquinone oxidoreductase 213 kDa subunit Aspergillus clavati 513 98 0 7 11.23 XP_570985 Nuclear membrane protein Cryptococcus neof 514 9677 0 7 11.23 XP—750734 Nuclear polyadenylated RNA-binding protein Nab2, putative Aspergillus fumiga 515 3757 0 7 11.23 BAA05461 PES4 PAB-like protein Saccharomyces cei 516 6195 0 7 11.23 YP_001058595 Putative lipoprotein Burkholderia psem 517 8771 0 7 11.23 ZP_01649899 Putative secreted protein Salinispora arenia 518 10040 0 7 11.23 XP—569426 rRNA binding protein Cryptococcus neofi 519 62 0 7 11.23 EAU83175 Serine/threonine-protein phosphatase PP2A-2 catalytic subunit Coprinopsis cinere 520 6486 0 7 11.23 AAS92554 SirD Leptosphaeria mac 521 2193 0 7 11.23 XP_569768 Tetrahydrofolylpolyglutamate synthase Cryptococcus 
neofi 522 12231 0 7 11.23 XP—748428 Tyrosinase, putative Aspergillus fumiga 523 1982 0 7 11.23 XP—572762 U1 small nuclear ribonucleoprotein Cryptococcus neofi 524 7918 0 7 11.23 XP—748133 UPF0047 domain protein Aspergillus fumiga.

117 Table 3.4 (continued) 525 3003 0 7 11.23 NP一938184 Zinc finger, CCHC domain containing 9 Danio rerio 532 1123 30 67 1.79 XP_572918 40S ribosomal protein SO Cryptococcus neoj 533 11315 16 42 2.11 XP_571819 Vesicle-mediated transport-related protein Cryptococcus neof 534 3136 0 8 12.84 XP_566786 C-5 sterol desaturase Cryptococcus neof 535 1229 0 8 12.84 XP_001263242 Carboxypeptidase Y, putative Neosartorya fische 536 2577 0 8 12.84 EDN62093 Conserved protein Saccharomyces cei 537 3024 0 8 12.84 XP_567743 Copper ion transporter Cryptococcus neof 538 8620 0 8 12.84 XP_572220 DNA repair-related protein Cryptococcus neof 539 12178 0 8 12.84 XP_568598 DNA replication factor Cryptococcus neof 540 8350 0 8 12.84 XP_001527691 DNA-directed RNA polymerases I and III 16 kDa polypeptide Lodderomyces elor 541 8267 0 8 12.84 NP_567724 FIB2 (FIBRILLARIN 2) Arabidopsis thalim 542 8811 0 8 12.84 XP_001246353 Glutathione S-transferase Coccidioides immL 543 3137 0 8 12.84 AAX81444 High nitrogen upregulated cytochrome P450 monooxygenase 2 Phanerochaete chr 544 1806 0 8 12.84 AAM15960 Histone deacetylase Hdal Ustilago maydis 545 8498 0 8 12.84 XP_572441 h-scol Cryptococcus neofi 546 2979 0 8 12.84 XP—747491 K+/H+ antiporter, putative Aspergillus fumiga 547 7734 0 8 12.84 Q9Y8B5 Mitochondrial-processing peptidase subunit beta (Beta-MPP) Lentinula edodes 548 6070 0 8 12.84 XP_001271454 Monocarboxylate permease, putative Aspergillus clavatu 549 8744 0 8 12.84 XP_567114 NADH dehydrogenase Cryptococcus neofi 550 7493 0 8 12.84 NP_001049484 0s03g0235100 Oryza sativa (japoi 551 2259 0 8 12.84 XP_001371603 PREDICTED: similar to baculoviral lAP repeat-containing 6 Monodelphis dome 552 3657 0 8 12.84 XP_974891 Similar to DnaJ (Hsp40) homolog, subfamily C, member 17 Tribolium castaneu 553 7894 0 8 12.84 ABE89053 Protein of unknown function DUF185 Medicago truncatu 554 4615 0 8 12.84 CAD 12881 Putative C2H2 zinc finger protein Podospora anserin 555 442 0 8 12.84 AAZ14925 Putative spindle 
checkpoint protein Coprinellus dissem 556 2994 0 8 12.84 CAG38357 Subtilisin-like protease Phanerochaete chr 557 10053 0 8 12.84 XP_001269391 Transcription elongation complex subunit (Cdc68) Aspergillus clavatu

118 Table 3.4 (continued) 558 8435 0 8 12.84 XP_569626 Transcription initiation factor TFIIE alpha subunit Cryptococcus neoj 559 6373 0 8 12.84 XP_754065 Transporter, putative Aspergillus fumiga 560 540 0 8 12.84 EAT48579 U2 small nuclear ribonucleoprotein, putative Aedes aegypti 561 842 0 8 12.84 XP_755284 Universal stress protein family domain protein Aspergillus fumiga 562 11917 0 8 12.84 XP_570855 UVSB PI-3 kinase Cryptococcus neoj 567 2295 8 27 2.71 NP_983979 ADLllVWp Ashbya gossypii A. 568 11857 14 38 2.18 XP_566648 Mitochondrion protein Cryptococcus neof 569 7826 19 48 2.03 BAE48267 Putative polyubiquitin Physcomitrella pal 570 7315 5 20 3.21 YP_372912 2-methylcitrate dehydratase Burkholderia sp. 3 571 6108 81 150 1.49 XP_569713 Hydrogen-transporting ATP synthase Cryptococcus neof 572 9742 20 49 1.97 XP_568755 Protein-vacuolar targeting-related protein Cryptococcus neof 573 4694 7 24 2.75 XP_001559519 3-hydroxyacyl-CoA dehydrogenase Botryotinia fuckelL 574 6348 7 24 2.75 JC4293 Fumarate hydratase (EC 4.2.1.2) precursor Rhizopus oryza 577 7478 4 18 3.61 AAC23703 Rahl Coprinus cinereus 578 5162 4 18 3.61 XP_752731 SH3 domain protein Aspergillus fumiga 579 2005 1 10 8.02 XP_568103 ARF GTPase activator Cryptococcus neofi 580 4975 1 10 8.02 XP_001270577 Condensin Aspergillus clavatu 581 3506 1 10 8.02 BAC75955 Dihydrofolate reductase Coprinopsis cine re 582 2822 1 10 8.02 Q8J0Q0 Mannosyl-oligosaccharide 1,2-alpha-mannosidase Candida albicans 583 5003 1 10 8.02 XP_001273582 MFS monocarboxylate transporter, putative Aspergillus clavatu 584 12137 1 10 8.02 ZP_01181490 NAD-dependent epimerase/dehydratase Bacillus cereus sub 585 8805 1 10 8.02 XP_001262431 Phthalate transporter, putative Neosartorya fische. 
586 1193 1 10 8.02 XP_968343 PREDICTED: similar to CG5649-PA Tribolium castaneu 587 1727 1 10 8.02 XP_972790 PREDICTED: similar to CG7382-PA Tribolium castaneu 588 7715 1 10 8.02 XP_001378582 Similar to delta8-delta7 sterol isomerase related protein EBRP Monodelphis dome. 591 11993 3 15 4.01 XP_640282 Allantoinase Dictyostelium discc ^_9895 3 15 4.01 XP_572254 Cationxation antiporter Cryptococcus neofc

119 Table 3.4 (continued) 593 242 3 15 4.01 XP_001274164 D-arabinitol dehydrogenase ArbD, putative Aspergillus clavati 594 68 3 15 4.01 XP_001265344 F-actin capping protein beta subunit Neosartorya fische 595 8427 3 15 4.01 ABL89163 L-glutamine D-fructose 6-phosphate amidotansferase Volvariella volvaa 596 2226 3 15 4.01 Q2UJ16 Mediator of RNA polymerase II transcription subunit 17 Aspergillus oryzae 597 199 3 15 4.01 XP_569457 NEDD8 activating enzyme Cryptococcus neoj 598 11372 3 15 4.01 ABH11429 Peroxin IIC Penicillium chryso 599 8652 3 15 4.01 Q4WVS2 Protein bcpl Aspergillus fumiga 600 3447 3 15 4.01 CAB56626 Ribosomal protein 22 of the small subunit Xanthophyllomyce. 601 3297 3 15 4.01 ZP—01646021 Sl/Pl nuclease Stenotrophomonas 602 2451 3 15 4.01 ZP—01439972 Single-strand DNA binding protein Fulvimarina pelag 603 7710 3 15 4.01 AAV65512 Thioredoxin reductase Cryptococcus neof 604 9020 2 13 5.22 XP_566548 C-14 sterol reductase Cryptococcus neof 605 10532 2 13 5.22 XP_001261178 Cell division control protein Cdc48 Neosartorya fische 606 6740 2 13 5.22 XP_566916 Glutamine synthetase Cryptococcus neofi 607 10776 2 13 5.22 P46076 Neutral protease 2 precursor (Deuterolysin) Aspergillus oryzae 608 3302 2 13 5.22 XP_791175 PREDICTED: similar to Histone deacetylase 8 Strongylocentrotus 609 9701 2 13 5.22 XP—718645 Putative U3 snoRNP protein Candida albicans !i 610 9614 2 13 5.22 AAX30393 SJCHGC03251 protein Schistosoma japon, 612 2315 25 57 1.83 EAU84630 ADP-ribosylation factor 1 Coprinopsis cinere 613 3449 6 22 2.94 YP—422386 Ferredoxin Magnetospirillum r 614 2755 8 26 2.61 XP—572156 Cytoplasm protein Cryptococcus neofi 615 1794 10 29 2.33 XP_001384381 Vacuolar H-ATPase 14 kDa subunit (subunit F) Pichia stipitis CBS 616 967 10 30 2.41 YP_357447 Predicted phosphatase Pelobacter carbino 617 418 10 30 2.41 AAB01771 Thioredoxin homolog Naegleriafowleri 618 7323 99 176 1.43 XP_571691 Ribosomal protein S2 Cryptococcus neofc 619 365 5 19 3.05 EAU88800 GTP-binding nuclear 
protein spil Coprinopsis cinere. 620 3505 5 19 3.05 NP_001043620 0s01g0624500 Oryza sativa (japor

120 Table 3.4 (continued) 621 3979 5 19 3.05 XP_001365208 PREDICTED: similar to peptidyl-prolyl cis-trans isomerase E Monodelphis dorm 622 2986 5 19 3.05 XP_568706 Ribonuclease H Cryptococcus neoj 624 2070 18 44 1.96 XP_001260649 Transcriptional corepressor Cyc8, putative Neosartorya fischi 625 4854 7 23 2.64 XP_721436 Phosphomannomutase Candida albicans 626 125 9 27 2.41 XP—001271135 Arsenite translocating ATPase ArsA, putative Aspergillus clavati 627 4066 0.5 6 9.63 BAB69078 Aft3-1 Altemaria altema 628 9109 0.5 6 9.63 AAC72747 Aryl-alcohol oxidase precursor Pleurotus eryngii 629 8138 0.5 6 9.63 Q0UMB9 ATP-dependent RNA helicase DBP4 Phaeosphaeria no� 630 95 0.5 6 9.63 NP—012078 Batlp Saccharomyces ce 631 118 0.5 6 9.63 XP—OO1263605 Clp protease, putative Neosartorya fische 632 3991 0.5 6 9.63 YP_466910 Cystathionine gamma-synthase Anaeromyxobacte) 633 6604 0.5 6 9.63 AAQ84022 Cytochrome P450 monooxygenase pc-bph Phanerochaete chi 634 8618 0.5 6 9.63 XP_001269813 DNA primase large subunit Aspergillus clavati 635 12206 0.5 6 9.63 XP_001261926 ELMO/CED-12 family protein Neosartorya fische 636 593 0.5 6 9.63 XP_568336 Endopeptidase Cryptococcus neoj 637 10071 0.5 6 9.63 XP_322016 ENSANGPOOOOOO12048 Anopheles gambia 638 10258 0.5 6 9.63 ZP_01461172 F5/8 type C domain protein Stigmatella aurant 639 8108 0.5 6 9.63 XP_752531 Flavin-containing monooxygenase Aspergillus fumiga 640 8149 0.5 6 9.63 XP_748218 General amidase-B Aspergillus fumiga 641 8276 0.5 6 9.63 NP—213161 GMP synthase Aquifex aeolicus V 642 6621 0.5 6 9.63 XP_001271061 Haemolysin-III channel protein Izh2, putative Aspergillus clavati 643 1048 0.5 6 9.63 CAJ81276 Haloacid dehalogenase-like hydrolase domain containing 1A Xenopus tropicalis 644 11579 0.5 6 9.63 AAX81444 High nitrogen upregulated cytochrome P450 monooxygenase 2 Phanerochaete cht 645 1270 0.5 6 9.63 AAU94648 Metalloprotease Pleurotus ostreatu. 646 1138 0.5 6 9.63 CAD35288 Metalloprotease, MEP3 Microsporum cani. 
647 5826 0.5 6 9.63 XP_571156 Mitochondrial processing peptidase Cryptococcus neof 648 2679 0.5 6 9.63 XP_001257 849 mRNA splicing factor (Prpl7), putative Neosartorya fische

121 Table 3.4 (continued) 649 3706 0.5 6 9.63 NP_719703 NAD dependent epimerase/dehydratase family protein Shewanella oneidt 650 120 0.5 6 9.63 AAY33179 OBPa Candida parapsilc 651 5976 0.5 6 9.63 NP_001046415 0s02g0244300 Oryza sativa (japo 652 10595 0.5 6 9.63 XP—570861 Pheromone maturation-related protein Cryptococcus neoj 653 2872 0.5 6 9.63 AAH05601 Phosducin Mus musculus 654 7083 0.5 6 9.63 YP_387503 Phosphoribosylaminoimidazole-succinocarboxamide synthase Desulfovibrio desi 655 2752 0.5 6 9.63 XP_533523 PREDICTED: similar to CGI 1596-PA’ isoform A Canis familiaris 656 7840 0.5 6 9.63 XP—001366738 PREDICTED: similar to neurexin 3a alpha Monodelphis domi 657 8869 0.5 6 9.63 XP_711765 Putative GPI-protein transamidase complex subunit Candida albicans • 658 2709 0.5 6 9.63 094122 Pyruvate kinase (PK) Agaricus bisporus 659 1422 0.5 6 9.63 CAE76162 Related to snRNP-associated protein Neurospora crassc 660 10349 0.5 6 9.63 CAJ41994 Ribosomal protein L30 Ustilago hordei 661 9912 0.5 6 9.63 NP_566574 S-adenosylmethionine-dependent methyltransferase/ catalytic Arabidopsis thalia 662 975 0.5 6 9.63 XP_572707 Thiamine pyrophosphokinase Cryptococcus neoj 663 2010 0.5 6 9.63 XP—001270029 TOM complex component Torn?, putative Aspergillus clavati 664 4477 0.5 6 9.63 XP_001384188 Zinc finger protein Pichia stipitis CBS 674 11875 4 17 3.41 XP—566718 DNA repair-related protein Cryptococcus neoj 675 5614 4 17 3.41 BAB62528 MFBC Lentinula edodes 676 2470 4 17 3.41 XP—566818 Protein-vacuolar targeting-related protein Cryptococcus neoj 677 5627 4 17 3.41 XP_566996 Urease accessory protein ureG Cryptococcus neoj 680 994 17 42 1.98 XP_570501 Actin binding protein Cryptococcus neoj 682 12221 3 14 3.74 XP—571107 Arp2/3 complex 34 kda subunit (p34-arc) Cryptococcus neof 683 357 3 14 3.74 XP—568136 Eukaryotic translation initiation factor 2B subunit 2 Cryptococcus neoj 684 2462 3 14 3.74 XP_569630 Nicotinate-nucleotide diphosphorylase (carboxylating) Cryptococcus neof 685 9609 3 14 3.74 
XP_001259165 Succinyl-CoA:3-ketoacid-coenzyme A transferase Neosartorya fische 686 7870 3 14 3.74 XP—001264533 Transcriptional elongation factor Iws 1, putative Neosartorya fische 687 4319 3 14 3.74 XP—566576 Ubiquinone biosynthesis methyltransferase Cryptococcus neofi

Table 3.7 (Continued)
689 11824 6 20 2.67 BAB87833 Leucine aminopeptidase Coprinopsis cinen
690 509 6 20 2.67 XP_001215350 Peptidyl-prolyl cis-trans isomerase cyp8 Aspergillus terrew
691 1385 6 20 2.67 XP_753795 SHO1 protein Aspergillus fumigc
692 2688 6 21 2.81 XP_001275834 DUF866 domain protein Aspergillus clavati
693 3100 6 21 2.81 XP_569725 Mitochondrial 60s ribosomal protein l38 (yml38) Cryptococcus neoj
694 209 6 21 2.81 XP_001264673 Vesicle-mediated transport protein Vid24, putative Neosartorya fischt

a The numbering is not continuous because genes without protein homologues are not included.
b The GLEAN model number takes the form Gene:Jan06m300_GLEAN_XXXXX when querying the C. cinerea genome annotation.
c The occurrence of a gene model is the sum of the counts of tags found within 1 kb upstream.
d Fold difference is calculated by dividing the % abundance in primordium by that in mycelium.
e An e-value of 1.00E-5 is used as the threshold for confirming protein homology.
f A p-value of 0.05 in the Fisher Exact Test is used as the threshold for confirming differential expression.
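The fold difference and the Fisher Exact Test named in the table footnotes can be illustrated with a short sketch. The library totals below (MYC_TOTAL, PRI_TOTAL) are placeholder values, not the actual library sizes; they were chosen only so that their ratio (about 0.802) reproduces the fold differences printed in the table. The substitution of 0.5 for a zero mycelial count is inferred from rows such as "0 10 16.05", and the two-sided form of the test is likewise an assumption.

```python
from math import exp, lgamma

# Placeholder library sizes (assumption): only their ratio, about 0.802,
# is chosen to match the fold differences printed in the table.
MYC_TOTAL = 80_230
PRI_TOTAL = 100_000


def fold_difference(myc_count, pri_count,
                    myc_total=MYC_TOTAL, pri_total=PRI_TOTAL):
    """% abundance in primordium divided by % abundance in mycelium.

    A zero mycelial count is replaced by 0.5 (inferred from the table)
    so that the ratio stays finite.
    """
    myc = myc_count if myc_count > 0 else 0.5
    return (pri_count / pri_total) / (myc / myc_total)


def _log_comb(n, k):
    # Natural log of the binomial coefficient C(n, k).
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(myc_count, pri_count,
                           myc_total=MYC_TOTAL, pri_total=PRI_TOTAL):
    """Two-sided Fisher exact p-value for one gene's tag counts."""
    tags = myc_count + pri_count      # all tags of this gene
    total = myc_total + pri_total     # all tags in both libraries

    def prob(x):
        # Hypergeometric probability that x of the gene's tags
        # fall in the mycelial library.
        return exp(_log_comb(tags, x)
                   + _log_comb(total - tags, myc_total - x)
                   - _log_comb(total, myc_total))

    lo = max(0, tags - pri_total)
    hi = min(tags, myc_total)
    p_obs = prob(myc_count)
    # Sum every outcome that is no more likely than the observed one.
    p = sum(q for q in map(prob, range(lo, hi + 1))
            if q <= p_obs * (1 + 1e-9))
    return min(1.0, p)


print(round(fold_difference(1, 13), 2))      # 10.43, as in rows with counts 1 and 13
print(round(fold_difference(0, 10), 2))      # 16.05, as in rows with counts 0 and 10
print(fisher_exact_two_sided(1, 13) < 0.05)  # True
```

With the actual library sizes substituted for the placeholders, the same two functions would apply the criteria described in footnotes d and f.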

[Figure 3.10: GOBU screenshot of the GO tree for the mycelial stage (MYC). Legible term counts: GO Root (139); cellular_component: organelle (78), protein complex (11), synapse (2); molecular_function (121): antioxidant activity (3), binding (70), catalytic activity (83), enzyme regulator activity (6), signal transducer activity (6), structural molecule activity (1), translation regulator activity (4), transporter activity (9); biological_process (105): cellular process (89), development (15), growth (4), interaction between organisms (4), physiological process (100), regulation of biological process (25), reproduction (4), response to stimulus (23), viral life cycle (1). Example gene shown: Jan06m300_GLEAN_01466.]

Figure 3.10 Gene Ontology (GO) annotations (http://www.geneontology.org/GO.downloads.shtml) for 139 differentially expressed genes in the mycelial stage, visualized with the freeware Gene Ontology Browsing Utility (GOBU) (http://gobu.iis.sinica.edu.tw). Of the 358 differentially expressed genes, 139 (38.8%) were assigned at least one GO term. Among these 139 genes, 111 were classified under cellular components, 121 under molecular functions and 105 under biological processes. Note that a gene may be assigned to more than one category, even within a subtree. GLEAN_01466 is used as an example to demonstrate the visualization in the program.

[Figure 3.11: GOBU screenshot of the GO tree for the primordial stage (PRI). Legible term counts: GO Root (484); cellular_component (420): cell (417), organelle (365), extracellular region (7), extracellular matrix (3); binding (245), protein binding (108), transcription regulator activity (33), transcription factor binding (12), transcription cofactor activity (8), transcriptional activator activity (8). Example gene shown: Jan06m300_GLEAN_08094.]

Figure 3.11 Gene Ontology (GO) annotations (http://www.geneontology.org/GO.downloads.shtml) for 484 differentially expressed genes in the primordial stage, visualized with the freeware Gene Ontology Browsing Utility (GOBU) (http://gobu.iis.sinica.edu.tw). Of the 696 differentially expressed genes, 484 (69.5%) were assigned at least one GO term. Among these 484 genes, 420 were classified under cellular components, 428 under molecular functions and 443 under biological processes. Note that a gene may be assigned to more than one category, even within a subtree. GLEAN_08094 is used as an example to demonstrate the visualization in the program.
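The percentages in the two figure captions, and the reason the per-category counts add up to more than the number of annotated genes, can be checked with a small sketch. The gene identifiers and namespace assignments in `gene_namespaces` are hypothetical and serve only to show how one gene contributes to several category counts.

```python
from collections import Counter


def annotation_coverage(annotated, differentially_expressed):
    """Percentage of differentially expressed genes with at least one GO term."""
    return round(100 * annotated / differentially_expressed, 1)


print(annotation_coverage(139, 358))  # 38.8 (mycelial stage)
print(annotation_coverage(484, 696))  # 69.5 (primordial stage)

# Hypothetical annotations: one gene may carry terms from several namespaces.
gene_namespaces = {
    "GLEAN_A": {"cellular_component", "molecular_function", "biological_process"},
    "GLEAN_B": {"molecular_function"},
    "GLEAN_C": {"cellular_component", "biological_process"},
}

# Count genes per namespace; a gene is counted once in each namespace it has.
counts = Counter(ns for spaces in gene_namespaces.values() for ns in spaces)
print(sum(counts.values()), "category assignments for", len(gene_namespaces), "genes")
```

The same overlap explains why, for the mycelial stage, 111 + 121 + 105 = 337 category assignments are reported for only 139 genes.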

3.4 Discussion

3.4.1 Validation of 5' SAGE libraries

Molecular techniques such as northern blotting, DNA microarrays (Zhang and Dietrich, 2005) and quantitative real-time PCR (Siu et al., 2001; Cimica et al., 2007) are commonly used to validate SAGE data. In this study, quantitative PCR (qPCR) was employed to assess the reliability and accuracy of the 5' SAGE data.

Because no housekeeping gene with stable expression across the developmental stages of C. cinerea had been reported, northern blotting was used to search for one. Three candidate genes, Cc.G6PDH, Cc.Pma and Cc.Ras, were tested, but none was constantly expressed as C. cinerea developed from mycelium through primordium to mature fruiting body. Their expression was generally higher in mycelium than in primordium, so they could not be used as control genes for qPCR.

qPCR is an effective means of mRNA quantitation because of its high sensitivity and specificity: mRNAs expressed at low levels can be detected, and only a small amount of RNA is required. Because of this high sensitivity, contamination must be avoided to ensure reliable results (Wong and Medrano, 2005). Nevertheless, melting curve analysis can be performed to verify the purity of the end products of the real-time PCR assays.

Twelve genes with different relative expression ratios were randomly selected to validate the 5' SAGE data. Expression was normalized against one of these genes (clathrin coat assembly protein), which showed constant expression between mycelium and primordium in three different pairs of biological samples. Constant expression in three independent samples should rule out bias arising from differences in RNA input during reverse transcription and from experimental error. The real-time PCR results were highly consistent with the 5' SAGE results, and the relative expression ratios were similar to the fold differences calculated from the 5' SAGE data. This confirmed the reliability and accuracy of the results from the two 5' SAGE libraries.
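The normalization against a reference gene described above can be sketched as follows. The text does not state which quantification model was used, so this sketch assumes the widely used 2^(-ΔΔCt) calculation with 100% amplification efficiency; the function name and all Ct values are hypothetical.

```python
def relative_expression(ct_target_pri, ct_ref_pri, ct_target_myc, ct_ref_myc):
    """Primordium/mycelium expression ratio of a target gene (2^-ddCt).

    Each Ct is first normalized to the reference gene measured in the same
    sample (here, clathrin coat assembly protein), then the two stages are
    compared. Assumes the PCR product doubles every cycle.
    """
    d_ct_pri = ct_target_pri - ct_ref_pri
    d_ct_myc = ct_target_myc - ct_ref_myc
    return 2.0 ** -(d_ct_pri - d_ct_myc)


# Hypothetical Ct values: the target crosses the threshold 3 cycles earlier
# in primordium, while the reference gene is unchanged between stages.
ratio = relative_expression(ct_target_pri=22.0, ct_ref_pri=20.0,
                            ct_target_myc=25.0, ct_ref_myc=20.0)
print(ratio)  # 8.0
```

A ratio computed this way is on the same scale as the 5' SAGE fold difference (% abundance in primordium divided by that in mycelium), which is what makes the two measurements directly comparable.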

The literature was also searched for validation of the expression data. Boulianne et al. (2000) reported the expression patterns and roles of two fungal galectins, Cgl1 (GLEAN model #07785) and Cgl2 (GLEAN model #07786), in C. cinerea, and observed that they were differentially regulated during fruiting body formation. Cgl1 is specifically expressed in primordium and mature fruiting body, while Cgl2 starts to be expressed in the early stages of fruiting body development (hyphal knot formation) and is maintained until maturation of the fruiting body. This agrees with the 5' SAGE data, as expression of both galectins was not detected in the mycelial stage but was detected in the primordial stage, with Cgl1 having the higher expression. Furthermore, Lee (C.Y.Y. Lee, personal communication) reported a 3-fold up-regulation of Cc.Rab7 between mycelium and primordium using northern blotting, which also agrees with the fold difference of 2.22 revealed in this study.

For future functional studies, the expression of clathrin coat assembly protein (and of the other genes found to be constantly expressed in mycelium and stage 2 primordium) in other developmental stages remains to be investigated, and a gene that is constantly expressed across more developmental stages still needs to be identified.

3.4.2 Analysis of highly and differentially expressed genes

At present, the C. cinerea genome is predicted to contain ~12,000-13,000 protein coding genes. Using this figure and the criteria in this study, the expression levels of a total of 6,736 distinct genes were revealed, which represents ~50% of the C. cinerea transcriptome. Although genes related to physiological processes such as mating, meiosis and fruiting body maturation may not have been detected, and some of the predicted proteins may not actually be expressed, it is believed that this study provided good coverage of the genes expressed in the mycelial and primordial stages.
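As a quick arithmetic check of the coverage estimate above, the 6,736 detected genes can be divided by each end of the predicted gene range. The variable names are illustrative only.

```python
detected_genes = 6_736
predicted_low, predicted_high = 12_000, 13_000

# Coverage is highest if the genome has the fewest predicted genes.
coverage_high = detected_genes / predicted_low * 100
coverage_low = detected_genes / predicted_high * 100

print(f"{coverage_low:.1f}% to {coverage_high:.1f}%")  # 51.8% to 56.1%
```

The result, roughly 52-56% of the predicted genes, is consistent with the rounded figure of ~50% quoted in the text.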

A protein homologous to mismatched base pair and cruciform DNA recognition protein was encoded by the most highly expressed gene in the mycelium, which was also the gene most preferentially expressed in this stage. This protein is an abundant eukaryotic non-histone nuclear protein associated with chromatin. It specifically recognizes cruciform DNA structure (a non-double helix form of DNA), which is observed during genetic recombination, and thus was originally believed to play a role in recombination (Bianchi et al., 1989). However, further analyses revealed that the protein generally recognizes cruciform DNA, helps bend DNA and binds to bent or kinked DNA, thereby serving as a structural element in inducing or maintaining DNA architecture (Dutta et al., 1997). Einck and Bustin (1985) suggested that DNA replication can generate branched DNA either directly or through induced supercoiling, and it was observed that the protein can effectively drive the equilibrium toward the formation of a DNA-protein complex. However, whether this complex serves a protective purpose until further nuclear activities or serves specialized functions remains unknown. Based on these findings, one speculation is that cruciform DNA structures are induced as a result of the high growth rate of the C. cinerea mycelium, and that in the primordium the cell division rate declines such that expression of the protein drops accordingly.

Intriguingly, the enzyme ribitol kinase was highly expressed in both the mycelial and primordial stages. The enzyme has been reported to participate actively in the biosynthesis of riboflavin in many micro-organisms, some of which produce the vitamin in large excess (Bacher and Lingens, 1970). Although ribitol was not supplemented to the culture medium in this study, it has been shown that ribose or ribulose phosphate is an alternative source for riboflavin biosynthesis at a lower yield (Mehta et al., 1972). This agrees with the general fact that many edible mushrooms are an excellent source of riboflavin. On the other hand, ribosomal proteins also constituted a remarkable proportion of the most highly expressed genes in both developmental stages, but their expression in the primordial stage was clearly higher. This implies an elevated demand for various proteins, which is also reflected in an up-regulation of the whole protein synthesis machinery in the primordial stage, where preferential expression was observed for transcription factors, ubiquitin-related genes, proteasome-related genes and genes for intracellular protein transport.

Although many studies have been conducted to reveal the physiological processes involved in fruiting body initiation and development of C. cinerea, little is known about these processes at the molecular level. Comparison of the gene expression patterns from the 5' SAGE libraries showed that 358 genes were down-regulated and 696 genes were up-regulated (p < 0.05, Fisher's exact test) in the primordium compared to the mycelium. This suggests a significant switching of the transcriptomes between the two developmental stages, and investigation of these differentially expressed genes may provide some insights into the possible molecular events involved.
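To illustrate how such a test can be applied to tag counts, the following is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table (tags of one gene versus all other tags, in each library). The function names, the example counts and the library sizes are hypothetical; the software and exact table layout used in this study are not restated here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denominator = comb(row1 + row2, col1)

    def weight(x):
        # Number of tables with x in the top-left cell (margins held fixed).
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    tail = sum(w for w in map(weight, range(lowest, highest + 1)) if w <= observed)
    return tail / denominator


def tag_p_value(count_myc, total_myc, count_pri, total_pri):
    """p-value for one gene: its tag count against all other tags, per library."""
    return fisher_exact_two_sided(
        count_myc, total_myc - count_myc,
        count_pri, total_pri - count_pri,
    )


# Hypothetical counts: 2 tags in a 100,000-tag mycelial library versus
# 20 tags in a 120,000-tag primordial library.
p = tag_p_value(2, 100_000, 20, 120_000)
print(f"p = {p:.2e}, below 0.05: {p < 0.05}")
```

The weights are kept as exact integers so that the comparison with the observed table does not depend on floating-point rounding. The sketch tests one gene at a time and applies no correction for multiple testing.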

Signaling of light The light/dark cycle is the most important factor for initiation and development of the fruiting body (Kües et al., 2000). Previous studies revealed that hyphal knots are formed in the dark and are repressed by light, whereas light is essential for subsequent development of the fruiting body (Ballou and Holton, 1985). In this respect, proteins involved in phototransduction were examined. A protein (GLEAN_02872) showing high homology to phosducin (Pdc), found in a number of mammals, was preferentially expressed in the primordium. Pdc expression is highly restricted in mammals, being found at significant levels only in the photoreceptor cells of the retina and the pineal gland (Willardson and Hewlett, 2007). This expression pattern suggests a specific role for Pdc in light signaling. In these cells, Pdc is phosphorylated in the dark by cAMP-dependent protein kinase (PKA) and subsequently dephosphorylated upon exposure to light (Lee et al., 1984). Its phosphorylation in the dark is probably due to a high cAMP level, whereas dephosphorylation in the light is a result of a reduced level of cAMP. Interest in Pdc greatly increased when it was co-purified from retinal extracts with the G protein beta-gamma subunit complex (Gtβγ). Based on these findings, a role for the protein was proposed in the down-regulation of the G protein phototransduction signaling pathway in the light, and in vitro experiments showed that dephosphorylated Pdc could block G protein signaling by disrupting the interaction between the G protein α subunit and Gtβγ (Bauer et al., 1992). This is, to a certain extent, contradictory to our current understanding that light induces the expression of some genes such as the galectin Cgl1 (Boulianne et al., 2000).

However, the detected level of Pdc in the primordium was not high, and presumably the expression level of the protein may differ between the light and dark periods. Therefore, studying the transcript level of Pdc in the primordium in light and dark periods may provide more information on this issue. In addition, the C. cinerea genome was searched for Pfam Pdc domains and another homolog was identified, but its expression was not detected.

On the other hand, a homolog of Rho GTPase in Aspergillus (GLEAN_07915), which also showed differential expression in the primordium, was identified. The activity of the enzyme as an effector is proposed to be under the control of Gα-GTP and Gtβγ. However, other potential effector enzymes such as adenylyl cyclase, cGMP phosphodiesterase, phospholipase Cβ and phosphatidylinositol-3-kinase did not show differential expression.

Nutrient Depletion Fruiting body initiation is a result of nutrient depletion in the culture medium (Kües et al., 2000), which implies that studying signaling pathways or molecules related to nutrient sensing and the starvation response may help in understanding the molecular events involved. In a wide range of organisms from yeast to Drosophila to human, the target of rapamycin (TOR, also known as FRAP, FKBP12-rapamycin-associated protein) is a central regulator of cell growth and metabolism (Martin and Hall, 2005). TOR is a serine/threonine protein kinase that controls a wide range of cellular events in response to different environmental cues such as stimulation by growth factors, changes in nutrient conditions and fluctuations in energy (Bai et al., 2007), and is specifically inhibited by rapamycin, a bacterial product, through its interaction with the FKBP12 protein. In mammals, FKBP38, a member of the FK506-binding protein (FKBP) family that is structurally similar to FKBP12, binds to TOR and inhibits its activity in a dose-dependent manner, in a way similar to that of the FKBP12-rapamycin complex.

Previous studies by Bai et al. (2007) showed that FKBP38 is an endogenous inhibitor of TOR through its ability to bind TOR. During amino acid (nitrogen) starvation, FKBP38 binds and interferes with TOR, thereby inhibiting cell growth. However, it was found that Rheb, a Ras-like small GTPase, is able to prevent such inhibition by binding to FKBP38 in a GTP-dependent manner. In turn, TOR is activated and exhibits its kinase activity. From the 5' SAGE libraries, a 2.83-fold up-regulation of a protein homolog of FKBP12 in Schizophyllum commune (GLEAN_01774) was observed, while an almost 18-fold up-regulation of the homolog of the Rheb small monomeric GTPase in Aspergillus (GLEAN_02265) was also observed. This suggests that the mechanism of activation of TOR by the Rheb GTPase in response to nutrient availability is well conserved in C. cinerea.
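Because the two libraries differ in size, a fold difference is only meaningful after the raw tag counts have been scaled to a common library size. The sketch below assumes a simple tags-per-million scaling; the counts and library sizes are hypothetical, and the normalisation actually applied in this study is not restated here.

```python
def tags_per_million(count, library_size):
    """Scale a raw tag count to a library of one million tags."""
    return count * 1_000_000 / library_size


def fold_change(count_myc, total_myc, count_pri, total_pri):
    """Primordium-over-mycelium ratio of normalised tag counts."""
    myc = tags_per_million(count_myc, total_myc)
    pri = tags_per_million(count_pri, total_pri)
    if myc == 0:
        # The ratio is undefined when the gene is not detected in the mycelium.
        return float("inf") if pri > 0 else float("nan")
    return pri / myc


# Hypothetical example: 10 tags in 100,000 (mycelium), 30 tags in 150,000 (primordium).
print(fold_change(10, 100_000, 30, 150_000))  # prints 2.0
```

In this example the raw count triples, but the normalised expression only doubles, because the second library is larger.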

In addition, nutrient starvation is also responsible for the onset of sexual development, including conjugation, meiosis and sporulation, in fission yeast (Tsukahara et al., 1998). During the primordial stage of C. cinerea, sexual commitment has started, as observed from the formation of the gills and basidia, where basidiospores will be formed (Kuhad et al., 1987). In Schizosaccharomyces pombe, the onset of sexual development invoked by nitrogen starvation requires the induction of the key transcription factor gene Ste11, which is essential for the activation of a number of genes needed for the initiation and progression of conjugation and meiosis (Okazaki et al., 1998). To date, four main factors or pathways have been identified that regulate the expression of Ste11: (1) the cAMP-Pka1 cascade, which is mediated mainly through a carbon source signal; (2) the mitogen-activated protein (MAP) kinase kinase-MAP kinase cascade, which transduces a stress signal; (3) Pac2; and (4) Rcd1.

As observed from the 5' SAGE libraries, the two copies of the cAMP-dependent protein kinase gene (GLEAN_00966 and GLEAN_09802) had very low expression levels, suggesting that sexual development in C. cinerea may not be mediated by a signal arising from carbon depletion. Intriguingly, the loci of the MAP kinase (the most highly expressed being a homolog of MAP kinase in Cryptococcus neoformans, with an occurrence of 4) and MAP kinase kinase (the 2 loci were not expressed at all) in C. cinerea were barely expressed in the primordial stage, but notable expression was detected in the mycelium. This differs markedly from observations in another basidiomycete, Lentinula edodes, in which expression of MAP kinase was the highest in the primordium among all developmental stages (Szeto et al., 2007), and also contrasts with Schizosaccharomyces pombe, in which the MAP kinase kinase-MAP kinase cascade is essential for Ste11 induction in response to nutrient starvation (Okazaki et al., 1998). This argues against roles for MAPK and MAPKK during sexual development of C. cinerea, and we speculate that these two kinases may function predominantly during mating through pheromone-activated cascades, whereas they may not play a role in the onset of karyogamy and the subsequent meiosis and sporulation processes.

The Rcd1 gene was discovered about a decade ago (Okazaki et al., 1998). This leucine-rich factor, which has no apparent motifs and no significant homology to any other proteins with known functions, is highly conserved in sequence among many eukaryotes. Although the mechanism is unclear, the rcd1 factor was shown to be crucial for Ste11 induction during nitrogen starvation in fission yeast, but it is not itself a component of the general nitrogen signal cascade. In C. cinerea, the homolog of this rcd1 factor (GLEAN_07250) was also identified, and is preferentially expressed with a 4-fold up-regulation in the primordium. Therefore, it is proposed that the rcd1 homolog is also important for the onset of sexual development in C. cinerea, probably through induction of certain transcription factors and subsequent activation of other genes. The striking conservation of the rcd1 gene throughout eukaryotes may also suggest the existence of an evolutionarily conserved differentiation-controlling system.

In addition to the genes related to light signaling and nutrient depletion, some other genes have also been suggested to be actively involved in fruiting body initiation and development. Liu et al. (2006) reported that the cfs1 gene, which shows high homology to the bacterial cyclopropane fatty acid synthases that convert membrane-bound unsaturated fatty acids into cyclopropane fatty acids, is essential for fruiting body initiation. Using the UV-mutated C. cinerea strain AmutBmut, they found that a T-to-G transversion in the cfs1 gene arrested the mushroom at the hyphal knot stage without further development. As in bacteria, the physical properties of the cell membrane may be altered through the production of cyclopropane fatty acids by Cfs1, and this may be one of the triggers that initiate fruiting body morphogenesis. Accordingly, when the expression of the three loci of cyclopropane fatty acid synthase is taken together, the expression level was similar between mycelium and primordium, which generally agrees with the proposal of Liu et al. (2006) that the cfs1 gene is superfluous during vegetative mycelial growth but essential for the progression of hyphal knots to fruiting body initials. In some feeding experiments, membrane-interactive compounds such as sucrose esters of fatty acids and cerebrosides could induce fruiting body development in various basidiomycetes (Magae et al., 2004; Kawai, 1989), leading to the postulate that membrane alteration may serve as a stress signal that promotes the transition from mycelium to fruiting body.

Chapter 4 General Discussion

Coprinopsis cinerea, the inky cap mushroom, is one of the model organisms widely used to study developmental processes in basidiomycetous fungi. Its short life cycle, ease of cultivation and fruiting in the laboratory, and the availability of the draft genome assembly have greatly facilitated its genetic studies.

Not much is known about the molecular processes involved in fruiting body initiation and development of basidiomycetes, and the current annotations of the C. cinerea genome contain no data on expression patterns and transcription start sites (TSS) at a genome-wide level. Understanding the fruiting process has been a long-term goal for scientists engaged in mushroom research, and a comprehensive transcriptome analysis not only provides possible insights into the fruiting mechanisms, but also contributes to better annotation of the genome and to our knowledge of transcription regulation.

This project represents the first comprehensive 5' SAGE analysis of a basidiomycetous fungus. The mycelial and primordial stages of C. cinerea were analyzed to generate a large amount of data on its transcriptome, differential gene expression patterns and transcription start sites. The method employs the terminal transferase activity of reverse transcriptase and the characteristics of the type IIS restriction enzyme MmeI to generate ditags. This ditag strategy ensures that each tag originates from a different mRNA and is not an artifact of PCR amplification. Therefore, tags with multiple occurrences can reliably be used to indicate the expression abundance of individual genes. Combined with the powerful sequencing capability of the 454 genome sequencer GS20, the strategy avoids the tedious steps of concatemerization, cloning and colony picking in the original SAGE protocols. The cost of analyzing the transcriptome is also greatly reduced compared with generating the same amount of data through the traditional LongSAGE approach.
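The logic of the ditag strategy can be sketched in a few lines. The example assumes, as in conventional SAGE analysis, that repeated identical ditags are treated as amplification duplicates and counted once, and that the second tag of a ditag is read as the reverse complement of its 3' end. The tag length of 18 nt, the assumption that linker sequences have already been trimmed, and the function names are all illustrative rather than the parameters used in this study.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_tags(ditags, tag_length=18):
    """Count tags from linker-trimmed ditag reads, collapsing duplicate ditags."""
    counts = Counter()
    for ditag in set(ditags):  # identical ditags are counted only once
        if len(ditag) < 2 * tag_length:
            continue  # too short to hold two full tags
        counts[ditag[:tag_length]] += 1
        counts[reverse_complement(ditag[-tag_length:])] += 1
    return counts


# Toy example with 4-nt tags: the duplicated first ditag contributes only once.
toy = ["AAAACCCC", "AAAACCCC", "AAAATTTT", "ACG"]
print(count_tags(toy, tag_length=4))
```

In the toy example the tag AAAA is counted three times: once from each of the two distinct ditags as the 5' tag, and once more as the reverse complement of TTTT.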

Moreover, the sequencing capacity provides high enough sensitivity to detect very rare transcripts. For instance, in the analysis of TSS, alternative start sites were observed for most of the abundantly expressed genes. Among the clusters of start sites, some occurred only once or twice, and yet were detectable in the 5' SAGE libraries. On this rationale, genes with very low expression levels are still likely to be identified. To further assess the usefulness of this increased sensitivity, the expression of transcription factors, which are known to be of very low abundance, was surveyed. From both 5' SAGE libraries, the expression of more than 60 genes identified as homologues of transcription factors was detected. Because these genes were annotated automatically by BLAST, some more transcription factors are likely to have been missed. Many of these transcription factors had an occurrence of only one or two, and while some of the mapped tags may be spurious, the ability of the 5' SAGE analysis to detect low-abundance transcripts is highlighted.

The study successfully isolated a total of ~40,000 and ~50,000 unique tag sequences from the transcriptomes of the mycelial and primordial stages of C. cinerea respectively. Approximately 80% of these tags matched a single position on the genome sequence, which is comparable to several previous studies (Zhang and Dietrich, 2005; Hashimoto et al., 2004; Wei et al., 2004). Among these tags, approximately 35-37% were located in the putative 5'-UTR, representing an estimate of ~10,000 and ~15,000 genuine transcription start sites in the two developmental stages respectively.
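One reading of these figures is that the TSS estimate is the product of the unique tag count, the fraction mapped to a single genomic position, and the fraction of those located in the putative 5'-UTR. The sketch below works through that arithmetic; pairing 35% with the mycelium and 37% with the primordium is an assumption made only for illustration.

```python
def estimate_tss(unique_tags, single_position_fraction, utr_fraction):
    """Unique tags that map to one position and fall in a putative 5'-UTR."""
    return unique_tags * single_position_fraction * utr_fraction


mycelium = estimate_tss(40_000, 0.80, 0.35)
primordium = estimate_tss(50_000, 0.80, 0.37)
print(round(mycelium), round(primordium))  # prints 11200 14800
```

Both values are of the same order as the approximate estimates quoted above.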

The TSS data are crucial for defining transcriptional units and for analyses of promoters and upstream regulatory elements. In addition, mapping of these unique tags revealed several phenomena. The existence of alternative TSS, which are widely present in many other organisms, was confirmed in C. cinerea. It has been suggested that alternative TSS are not merely 'biological noise', but rather imply a new level of biological complexity within the core promoters (Kawaji et al., 2006). Also, approximately 20% of the unique tags were mapped to gene-associated positions in antisense orientation, representing either potential antisense transcripts or unannotated genes resulting from the extensive overlapping of transcriptional units in eukaryotic genomes. Moreover, possible errors in the current genome annotations can be rectified using the mapping data, as the TSSs may reveal inaccurate assignment of the ATG codon to annotated exon-intron boundaries.

By considering the tags mapped to the putative 5'-UTR, the expression level of genes can be evaluated. In this study, the expression levels of 6,736 genes were revealed, from which more than 1,000 differentially expressed genes were identified. The C. cinerea genome is predicted to contain about 12,000-13,000 genes, which means that about half of the transcriptome has been covered. In the mycelial stage, the gene encoding a mismatched base pair and cruciform DNA recognition protein was the most highly expressed. The protein was suggested to play a role in maintaining or inducing DNA architecture (Dutta et al., 1997), but its exact role in the mycelium remains unknown. In the primordial stage, the gene for the enzyme ribitol kinase had the highest expression, although its expression in the mycelium was not low either. The enzyme was reported to participate actively in riboflavin biosynthesis (Bacher and Lingens, 1970), which fits the general fact that many edible mushrooms are an excellent source of riboflavin.

Investigation of the differentially expressed genes can provide insights into the possible molecular events during fruiting initiation and development, which are of great interest in mushroom research. Alternating periods of light and dark, and nutrient depletion, are two major signals for the initiation of fruiting. Therefore, genes responsible for light signaling and the sensing of nutrient depletion were examined. In this respect, the genes encoding homologues of Pdc and Rho GTPase were studied. Pdc has been suggested to have a specific role in light signaling in mammals (Willardson and Hewlett, 2007), and its differential expression in the primordial stage also supports such a role in C. cinerea. For the sensing of nutrient depletion, the potential role of the Rheb GTPase was considered in terms of its ability to activate TOR in response to nutrient availability. We also proposed that C. cinerea may employ a slightly different mode for the onset of sexual development, as judged from the dissimilarity of the expression patterns of MAP kinase and MAP kinase kinase compared to other fungi. In addition to light signaling and the sensing of nutrient depletion, some other molecular events were also investigated. An example is the alteration in membrane structure, which may serve as a stress signal to initiate the fruiting process. Nevertheless, our understanding of the whole fruiting machinery is still far from complete, and substantial further analyses will be required to characterize the existing players and to discover new potential elements.

Furthermore, during the expression analysis of the 5' SAGE data, many genes were found to be duplicated in the C. cinerea genome. The most prominent are the hydrophobins, with as many as 22 copies. Random searching of the genes for which expression was detected revealed some more highly duplicated genes, such as metalloprotease (19 copies), ubiquitin-protein ligase (16 copies), aryl-alcohol oxidase (7 copies) and histone deacetylase (6 copies). In fact, for many genes, at least 2-3 copies were identified in the genome. Because BLAST-searched homologues were assigned to the genes automatically, and because the nomenclature of the same protein may differ from organism to organism, the real occurrence of these duplicated genes is likely to be underestimated. In this respect, Wapinski et al. (2007) examined gene duplication and loss events in 17 fungal genomes to resolve the evolutionary history of all the genes. They commented that, in particular, stress-related genes exhibit many gene duplications and losses, whereas growth-related genes show selection against such changes. By characterizing the functional fate of the duplicated genes, they also showed that these genes rarely diverge with respect to biochemical function, but instead typically diverge with respect to regulatory control. Such findings may explain why some members of the duplicated genes have higher expression than others. Another possible explanation was suggested by Louis (2007): the commonest consequence of gene duplication seems to be the loss of all or part of the duplicated sequences through deletion or degeneration, resulting in non-functionality. Presumably, these non-functional genes are not expressed, which is in line with the observation that a notable proportion of the duplicated genes have an occurrence of only 1 or 2, which is likely to be spuriously assigned.

Comparative genomics Previously, SAGE analysis of the mycelial and primordial stages of another basidiomycete, Lentinula edodes (Chum et al., submitted), was also performed in our laboratory, and a total of 3,545 unique tags were isolated. Although both C. cinerea and L. edodes belong to the same order, Agaricales, they are only distantly related phylogenetically. The SAGE data can therefore be compared with the 5' SAGE data of this study, so that the two data sets supplement each other and help to confirm universal or distinct molecular events, especially those potentially important for fruiting, in the two mushrooms.

Data from SAGE studies on L. edodes suggested that the 3 hydrophobins identified are a typical example of differential expression between the mycelial and primordial stages, with one differentially expressed in the primordium and the other two in the mycelium. The 5' SAGE data strongly agree with this finding. A total of 22 genes encoding protein homologues of the hydrophobins, including the CoH1 and CoH2 genes, were identified; 8 of them showed differential expression in the mycelium whereas 4 were up-regulated in the primordium. Interestingly, none of the 10 remaining hydrophobin genes had an occurrence higher than 5 in either developmental stage. This implies that C. cinerea employs two totally different sets of hydrophobins during its transition from mycelium to primordium, and supports the notion that these fungal-specific hydrophobic proteins are essential for morphogenesis and pathogenesis in fungi and for fruiting body development in mushrooms (Kershaw & Talbot, 1998). Wosten et al. (1999) also mentioned that the hydrophobins serve divergent functions, such as the formation of aerial structures, mediation of hyphal attachment to hydrophobic surfaces, and formation of hydrophobic rodlet layers on the fungal spore, suggesting distinct roles for hydrophobins in the two developmental stages.

The gene encoding riboflavin (vitamin B2) aldehyde forming enzyme was heavily expressed in the primordium of L. edodes, as observed from the SAGE data. The enzyme oxidizes the 5' site of the ribityl moiety of riboflavin to an aldehyde group (hence the name B2-aldehyde) and serves as a means of biological degradation of riboflavin (Tachibana and Oka, 1981). In contrast, in C. cinerea, a group of genes related to riboflavin biosynthesis, including the ribitol kinase and riboflavin synthase genes, were found to be highly and preferentially expressed in the primordial stage, suggesting a high synthesis rate of riboflavin at this stage. Three loci encoding riboflavin aldehyde forming enzyme also had increased expression in the primordium, but the difference was not significant and the expression levels were low. It was previously suggested that the B2-aldehyde forming enzyme may play an important role in fruiting body formation (Miyazaki et al., 2005), but since in C. cinerea the equilibrium is shifted towards riboflavin, the roles of the B2-aldehyde forming enzyme and the riboflavin synthesizing enzymes during fruiting will require further characterization.

Observations for other basic biological processes are largely similar. The SAGE data suggested an up-regulation of genes responsible for fatty acid biosynthesis, including the acyl carrier protein (ACP) and fatty acid desaturase, in the primordium. It is proposed that an increased synthesis of fatty acids is essential for energy production, structural components and the synthesis of lipoproteins to support the active growth of the primordium (Chum et al., submitted). Likewise, the 5' SAGE analysis recorded an up-regulation of a number of genes involved in fatty acid biosynthesis, including ACP, fatty acid desaturases, cyclopropane fatty acid synthase and fatty acid transporter. In addition, many genes responsible for stress response and intracellular trafficking were observed to be differentially expressed in the primordial stage in both the SAGE and 5' SAGE analyses.

Future Prospects The data generated in this study open up new areas of analysis at a genome-wide level. The sequences flanking the TSS can be isolated and aligned to check for consensus patterns. For instance, the 5' SAGE study by Zhang and Dietrich (2005) was able to correct and refine the previously reported consensus sequence around the TSS. They also showed that the consensus was significantly different from that of human and suggested that yeast might have an unusual mode of transcription initiation. In addition, the sequences upstream of the TSS can be used for the prediction and analysis of promoters and potential upstream open reading frames (uORFs). Moreover, the unique tags that matched in antisense orientation to currently annotated genes, or to positions with no annotations nearby, are worth investigating, as they may represent novel transcripts and potential regulatory elements.
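As an illustration of the consensus check proposed above, the following minimal sketch tabulates base frequencies at each position of equal-length sequences flanking the TSS and reports the most frequent base per position. The input sequences, the 50% threshold and the function names are hypothetical; a real analysis would extract the flanking sequences from the genome assembly at the mapped tag positions.

```python
from collections import Counter


def position_counts(sequences):
    """Per-position base counts for equal-length sequences aligned on the TSS."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all flanking sequences must have the same length")
    return [Counter(s[i] for s in sequences) for i in range(length)]


def consensus(sequences, min_fraction=0.5):
    """Most frequent base per position, or 'N' where none reaches min_fraction."""
    letters = []
    for column in position_counts(sequences):
        base, n = column.most_common(1)[0]
        letters.append(base if n / len(sequences) >= min_fraction else "N")
    return "".join(letters)


# Hypothetical 5-nt windows centred on four mapped start sites.
windows = ["TCAAT", "TCAGT", "ACATT", "TCACT"]
print(consensus(windows))  # prints "TCANT"
```

The per-position counts returned by `position_counts` could equally be converted into a frequency matrix or a sequence logo for comparison with consensus sequences reported in other organisms.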

The expression data obtained provide information on differentially expressed genes which may play a role during fruiting body initiation and development. The literature offers a pool of candidates suggested to participate actively in the fruiting process. Together with the availability of SAGE data from another mushroom, Lentinula edodes, comparison of the transcriptomes can yield a number of potentially important players. Silencing these genes is one effective means of elucidating their roles during the fruiting process. Given that gene silencing using double-stranded RNA (Namekawa et al., 2005) or homologous hairpin RNA (Walti et al., 2006) has proven successful, and that the corresponding vectors and protocols are readily available, gene silencing is expected to be a powerful tool for elucidating the functions of various genes.

With the sequencing capacity of genome sequencers being pushed up to 100 MB by the GS FLX (Roche), 2,000 MB by the Illumina Genome Analyzer (Solexa) and 3,000 MB by the SOLiD DNA sequencer (ABI), simultaneous analysis of more developmental stages is made possible through multiplexing of samples, such that a more global picture of the fruiting process can be revealed. The lack of replicates, a severe drawback of the SAGE technology (Nielsen et al., 2006), can also be addressed by this increased sequencing capacity.

Conclusion In this study, 5' SAGE libraries of the dikaryotic mycelial and primordial stages of the basidiomycete C. cinerea were successfully constructed to generate a large amount of data on its transcriptome, gene expression patterns and TSS. The tag mapping process identified the presence of alternative TSS and antisense transcripts and facilitated the discovery of potential novel genes and regulatory elements, thus helping to improve the current genome annotations. In addition, the expression levels of more than 6,000 genes were revealed, from which more than 1,000 differentially expressed genes were identified, suggesting a significant switching of the transcriptomes between the two developmental stages. Through comparison with the expression profiles studied in other fungi, these differentially expressed genes serve as a valuable platform for understanding gene functions and various biological processes, including the molecular events underlying the fruiting process in basidiomycetes.

References

1. Adams, M.D., Kelley, J.M., Gocayne, J.D., Dubnick, M., Polymeropoulos, M.H., Xiao, H., Merril, C.R., Wu, A., Olde, B., Moreno, R.F., et al. 1991. Complementary DNA sequencing: expressed sequence tags and human genome project. Science 252: 1651-1656.

2. Aimi, T., Fukuhara, S., Ishiguro, M., Kitamoto, Y. and Morinaga, T. 2004. Primary structure of dihydrofolate reductase and mitochondrial ribosomal protein L36 genes from the basidiomycete Coprinus cinereus. DNA Sequence 15: 291-298.

3. Albert, I., Mavrich, T.N., Tomsho, L.P., Qi, J., Zanton, S.J., Schuster, S.C. and Pugh, B.F. 2007. Translational and rotational settings of H2A.Z nucleosomes across the genome. Nature 446: 572-576.

4. Alexandrov, N.N., Troukhan, M.E., Brover, V.V., Tatarinova, T., Flavell, R.B. and Feldmann, K.A. 2006. Features of Arabidopsis genes and genome discovered using full-length cDNAs. Plant Molecular Biology 60: 69-85.

5. Alwine, J.C., Kemp, D.J. and Stark, G.R. 1977. Method for detection of specific RNAs in agarose gels by transfer to diazobenzyloxymethyl-paper and hybridization with DNA probes. Proceedings of the National Academy of Sciences USA 74: 5350-5354.

6. Arima, T., Yamamoto, M., Hirata, A., Kawano, S. and Kamada, T. 2004. The eln3 gene involved in fruiting body morphogenesis of Coprinus cinereus encodes a putative membrane protein with a general glycosyltransferase domain. Fungal Genetics and Biology 41: 805-812.

7. Arora, D. 1986. Mushrooms demystified. A comprehensive guide to the fleshy fungi, 2nd ed. Ten Speed Press, Berkeley, California.

8. Bainbridge, M.N., Warren, R.L., Hirst, M., Romanuik, T., Zeng, T., Go, A., Delaney, A., Griffith, M., Hickenbotham, M., Magrini, V. et al. 2006. Analysis of the prostate cancer cell line LNCaP transcriptome using a sequencing-by-synthesis approach. BMC Genomics 7: 246.

9. Bacher, A. and Lingens, F. 1970. Biosynthesis of riboflavin. Formation of 2,5-diamino-6-hydroxy-4-(1'-D-ribitylamino)pyrimidine in a riboflavin auxotroph. The Journal of Biological Chemistry 245: 4647-4652.

10. Bai, X., Ma, D., Liu, A., Shen, X., Wang, Q.J., Liu, Y. and Jiang, Y. 2007. Rheb activates mTOR by antagonizing its endogenous inhibitor, FKBP38. Science 318: 977-980.

11. Ballou, L.R. and Holton, R.W. 1985. Synchronous initiation and sporulation of fruitbodies by Coprinus cinereus on a defined medium. Mycologia 77: 103-108.

12. Bartlett, J.M. 2002. Approaches to the analysis of gene expression using mRNA: a technical overview. Molecular Biotechnology 21: 149-160.

13. Bauer, P.H., Muller, S., Puzicha, M., Pippig, S., Obermaier, B., Helmreich, E.J.M. and Lohse, M.J. 1992. Phosducin is a protein kinase A-regulated G-protein regulator. Nature 358: 73-76.

14. Berezikov, E., Thuemmler, F., van Laake, L.W., Kondova, I., Bontrop, R., Cuppen, E. and Plasterk, R.H. 2006. Diversity of microRNAs in human and chimpanzee brain. Nature Genetics 38: 1375-1377.

15. Bertossa, R.C., Kues, U., Aebi, M. and Kunzler, M. 2004. Promoter analysis of cgl2, a galectin encoding gene transcribed during fruiting body formation in Coprinopsis cinerea (Coprinus cinereus). Fungal Genetics and Biology 41: 1120-1131.

16. Bian, X.L., Ng, W.L. and Kwan, H.S. 2001. Isolation of genes differentially expressed in primordium and studies of their expression during fruit body development of Shiitake mushroom Lentinula edodes. 101st General Meeting of American Society for Microbiology. Abstract.

17. Bianchi, M.E., Beltrame, M. and Paonessa, G. 1989. Specific recognition of cruciform DNA by nuclear protein HMG1. Science 243: 1056-1059.

18. Binninger, D.M., Skrzynia, C., Pukkila, P.J. and Casselton, L.A. 1987. DNA-mediated transformation of the basidiomycete Coprinus cinereus. The EMBO Journal 6: 835-840.

19. Boulianne, R.P., Liu, Y., Aebi, M., Lu, B.C. and Kues, U. 2000. Fruiting body development in Coprinus cinereus: regulated expression of two galectins secreted by a non-classical pathway. Microbiology 146: 1841-1853.

20. Bratkovskaja, I., Vidziunaite, R. and Kulys, J. 2004. Oxidation of phenolic compounds by peroxidase in the presence of soluble polymers. Biochemistry (Moscow) 69: 985-992.

21. Brenner, S., Johnson, M., Bridgham, J., Golda, G., Lloyd, D.H., Johnson, D., Luo, S., McCurdy, S., Foy, M., Ewan, M. et al. 2000. Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays. Nature Biotechnology 18: 630-634.

22. Broad Institute. 2007. Coprinus cinereus project information. Retrieved July 18, 2007 from the World Wide Web: http://www.broad.mit.edu/annotation/fungi/coprinus-cinereus/background.html

23. Buller, A.H.R. 1931. Researches on fungi, IV. Further observations on the Coprinus together with some observations on social organization and sex in the hymenomycetes. Hafner Publishing Co., New York, N.Y.

24. Casselton, L.A. 1995. Genetics of Coprinus, p. 35-48. In U. Kuck (ed.), The mycota, vol. II. Genetics and biotechnology. Springer-Verlag KG, Berlin, Germany.

25. Casselton, L.A. and Economou, A. 1985. Dikaryon formation, p. 213-229. In Moore, D., Casselton, L.A., Wood, D.A. and Frankland, J.C. (ed.), Developmental biology of higher fungi. Cambridge University Press, Cambridge, United Kingdom.

26. Casselton, L.A. and Olesnicky, N.S. 1998. Molecular genetics of mating recognition in basidiomycete fungi. Microbiology and Molecular Biology Reviews 62: 55-70.

27. Chang, S.T., Buswell, J.A. and Miles, P.G. (ed.) 1993. Genetics and breeding of edible mushrooms. Gordon and Breach Science Publishers, Y-Parc, Switzerland.

28. Chen, J., Sun, M., Lee, S., Zhou, G., Rowley, J.D. and Wang, S.M. 2002. Identifying novel transcripts and novel genes in the human genome by using novel SAGE tags. Proceedings of the National Academy of Sciences U.S.A. 99: 12257-12262.

29. Cheung, F., Haas, B.J., Goldberg, S.M., May, G.D., Xiao, Y. and Town, C.D. 2006. Sequencing Medicago truncatula expressed sequenced tags using 454 Life Sciences technology. BMC Genomics 7: 272.

30. Chum, W.Y., Ng, T.P. and Kwan, H.S. 2002. Serial Analysis of Gene Expression of differentially expressed gene profile of Shiitake mushroom Lentinula edodes. 102nd General Meeting of American Society for Microbiology. Abstract.

31. Cimica, V., Batusic, D., Haralanova-Ilieva, B., Chen, Y., Hollemann, T., Pieler, T. and Ramadori, G. 2007. Serial analysis of gene expression (SAGE) in rat liver regeneration. Biochemical and Biophysical Research Communications 31: 545-552.

32. Cooper, D.N.W., Boulianne, R.P., Charlton, S., Farrell, E.M., Sucher, A. and Lu, B.C. 1997. Fungal galectins: sequence and specificity of two isolectins from Coprinus cinereus. The Journal of Biological Chemistry 272: 1514-1521.

33. Cummings, W.J., Celerin, M., Crodian, J., Brunick, L.K. and Zolan, M.E. 1999. Insertional mutagenesis in Coprinus cinereus: use of a dominant selectable marker to generate tagged, sporulation-defective mutants. Current Genetics 36: 371-382.

34. De Backer, M.D., Raponi, M. and Arndt, G.M. 2002. RNA-mediated gene silencing in non-pathogenic and pathogenic fungi. Current Opinion in Microbiology 5: 323-329.

35. Ducros, V., Brzozowski, A.M., Wilson, K.S., Ostergaard, P., Schneider, P., Svendsen, A. and Davies, G.J. 2001. Structure of the laccase from Coprinus cinereus at 1.68 Å resolution: evidence for different 'type 2 Cu-depleted' isoforms. Acta Crystallographica D Biological Crystallography 57: 333-336.

36. Dutta, S., Gerhold, D.L., Rice, M., Germann, M. and Kmiec, E.B. 1997. The cloning and overexpression of a cruciform binding protein from Ustilago maydis. Biochimica et Biophysica Acta 1352: 258-266.

37. Einck, L. and Bustin, M. 1985. The intracellular distribution and function of the high mobility group chromosomal proteins. Experimental Cell Research 156: 295-310.

38. Emrich, S.J., Barbazuk, W.B., Li, L. and Schnable, P.S. 2007. Gene discovery and annotation using LCM-454 transcriptome sequencing. Genome Research 17: 69-73.

39. Freedman, T. and Pukkila, P.J. 1993. De novo methylation of repeated sequences in Coprinus cinereus. Genetics 135: 357-366.

40. Gibbings, J.G., Cook, B.P., Dufault, M.R., Madden, S.L., Khuri, S., Turnbull, C.J. and Dunwell, J.M. 2003. Global transcript analysis of rice leaf and seed using SAGE technology. Plant Biotechnology Journal 1: 271-285.

41. Gibson, U.E., Heid, C.A. and Williams, P.M. 1996. A novel method for real time quantitative RT-PCR. Genome Research 6: 995-1001.

42. Gowda, M., Li, H., Alessi, J., Chen, F., Pratt, R. and Wang, G.L. 2006. Robust analysis of 5'-transcript ends (5'-RATE): a novel technique for transcriptome analysis and genome annotation. Nucleic Acids Research 34: e126.

43. Granado, J.D., Kertesz-Chaloupkova, K., Aebi, M. and Kues, U. 1997. Restriction enzyme-mediated DNA integration in Coprinus cinereus. Molecular & General Genetics 256: 28-36.

44. Guillot, J. and Konska, G. 1997. Lectins in higher fungi. Biochemical Systematics and Ecology 25: 203-230.

45. Han, B., Toyomasu, T. and Shinozawa, T. 1999. Induction of apoptosis by Coprinus disseminatus mycelial culture broth extract in human cervical carcinoma cells. Cell Structure and Function 24: 209-215.

46. Hanai, H., Ishida, S., Saito, C., Maita, T., Kusano, M., Tamogami, S. and Noma, M. 2005. Stimulation of mycelia growth in several mushroom species by rice husks. Bioscience Biotechnology and Biochemistry 69: 123-127.

47. Hashimoto, S., Suzuki, Y., Kasai, Y., Morohoshi, K., Yamada, T., Sese, J., Morishita, S., Sugano, S. and Matsushima, K. 2004. 5'-end SAGE for the analysis of transcriptional start sites. Nature Biotechnology 22: 1146-1149.

48. Heid, C.A., Stevens, J., Livak, K.J. and Williams, P.M. 1996. Real time quantitative PCR. Genome Research 6: 986-994.

49. Hiscock, S.J. and Kües, U. 1999. Cellular and molecular mechanisms of sexual incompatibility in plants and fungi. International Review of Cytology 193: 165-295.

50. Hoegger, P.J., Navarro-Gonzalez, M., Kilaru, S., Hoffmann, M., Westbrook, E.D. and Kües, U. 2004. The laccase gene family in Coprinopsis cinerea (Coprinus cinereus). Current Genetics 45: 9-18.

51. Hofreuter, D., Tsai, J., Watson, R.O., Novik, V., Altman, B., Benitez, M., Clark, C., Perbost, C., Jarvie, T., Du, L. and Galan, J.E. 2006. Unique features of a highly pathogenic Campylobacter jejuni strain. Infection and Immunity 74: 4694-4707.

52. Houborg, K., Harris, P., Poulsen, J.C., Schneider, P., Svendsen, A. and Larsen, S. 2003. The structure of a mutant enzyme of Coprinus cinereus peroxidase provides an understanding of its increased thermostability. Acta Crystallographica D Biological Crystallography 59: 997-1003.

53. Ikehata, K., Buchanan, I.D., Pickard, M.A. and Smith, D.W. 2005. Purification, characterization and evaluation of extracellular peroxidase from two Coprinus species for aqueous phenol treatment. Bioresource Technology 96: 1758-1770.

54. Ikehata, K., Pickard, M.A., Buchanan, I.D. and Smith, D.W. 2004. Optimization of extracellular fungal peroxidase production by 2 Coprinus species. Canadian Journal of Microbiology 50: 1033-1040.

55. Jacob, F. and Monod, J. 1961. Genetic regulatory mechanisms in the synthesis of proteins. Journal of Molecular Biology 3: 318-356.

56. Kamada, T. 2002. Molecular genetics of sexual development in the mushroom Coprinus cinereus. Bioessays 24: 449-459.

57. Kamada, T., Kurita, R. and Takemaru, T. 1978. Effects of light on basidiocarp maturation in Coprinus macrorhizus. Plant Cell Physiology 19: 263-275.

58. Kamada, T. and Tsuru, M. 1993. The onset of the helical arrangement of chitin microfibrils in fruit-body development of Coprinus cinereus. Mycological Research 97: 884-888.

59. Kapranov, P., Willingham, A.T. and Gingeras, T.R. 2007. Genome-wide transcription and the implications for genomic organization. Nature Reviews Genetics 8: 413-423.

60. Kasai, Y., Hashimoto, S., Yamada, T., Sese, J., Sugano, S., Matsushima, K. and Morishita, S. 2005. 5'SAGE: 5'-end Serial Analysis of Gene Expression database. Nucleic Acids Research 33: D550-D552.

61. Katayama, S., Tomaru, Y., Kasukawa, T., Waki, K., Nakanishi, M., Nakamura, M., Nishida, H., Yap, C.C., Suzuki, M., Kawai, J. et al. 2005. Antisense transcription in the mammalian transcriptome. Science 309: 1564-1566.

62. Kawai, G. 1989. Molecular species of cerebrosides in fruiting bodies of Lentinus edodes and their biological activity. Biochimica et Biophysica Acta 1001: 185-190.

63. Kawaji, H., Frith, M.C., Katayama, S., Sandelin, A., Kai, C., Kawai, J., Carninci, P. and Hayashizaki, Y. 2006. Dynamic usage of transcription start sites within core promoters. Genome Biology 7: R118.

64. Keime, C., Semon, M., Mouchiroud, D., Duret, L. and Gandrillon, O. 2007. Unexpected observations after mapping LongSAGE tags to the human genome. BMC Bioinformatics 8: 154.

65. Kershaw, M.J. and Talbot, N.J. 1998. Hydrophobins and repellents: proteins with fundamental roles in fungal morphogenesis. Fungal Genetics and Biology 23: 18-33.

66. Kikuchi, M., Kitamoto, N. and Shishido, K. 2004. Secretory production of Aspergillus oryzae xylanase XynF1, xynF1 cDNA product, in the basidiomycetes Coprinus cinereus. Applied Microbiology and Biotechnology 63: 728-733.

67. Kiyosawa, H., Yamanaka, I., Osato, N., Kondo, S., Hayashizaki, Y., RIKEN GER Group and GSL Members. 2003. Antisense transcripts with FANTOM2 clone set and their implications for gene regulation. Genome Research 13: 1324-1334.

68. Klibanov, A.M., Kaplan, N.O. and Kamen, M.D. 1980. Thermal stabilities of membrane-bound, solubilized, and artificially immobilized hydrogenase from Chromatium vinosum. Archives of Biochemistry and Biophysics 199: 545-549.

69. Kothe, E. 2001. Mating-type genes for basidiomycete strain improvement in mushroom farming. Applied Microbiology and Biotechnology 56: 602-612.

70. Kronstad, J.W. and Staben, C. 1997. Mating type in filamentous fungi. Annual Review of Genetics 31: 245-276.

71. Kuhad, R.C., Rosin, I.V. and Moore, D. 1987. A possible relation between cyclic-AMP levels and glycogen mobilisation in Coprinus cinereus. Transactions of the British Mycological Society 88: 229-236.

72. Kües, U. 2000. Life history and developmental processes in the basidiomycete Coprinus cinereus. Microbiology and Molecular Biology Reviews 64: 316-353.

73. Kües, U., Granado, J.D., Hermann, R., Boulianne, R.P., Kertesz-Chaloupkova, K. and Aebi, M. 1998. The A mating type and blue light regulate all known differentiation processes in the basidiomycete Coprinus cinereus. Molecular and General Genetics 260: 81-91.

74. Kues, U., Walser, P.J., Klaus, M.J. and Aebi, M. 2002. Influence of activated A and B mating-type pathways on developmental processes in the basidiomycete Coprinus cinereus. Molecular Genetics and Genomics 268: 262-271.

75. Kwan, H.S., Chum, W.Y., Bian, X.L., Xie, W.J., Ng, T.P., Ng, W.L. and Zhang, M. 2003. Gene expression profiles of Shiitake mushroom Lentinula edodes revealed by differential display, microarray and serial analysis of gene expression. 103rd General Meeting of American Society for Microbiology. Washington, D.C., U.S.A. Abstract.

76. Labarere, J. and Bernet, J. 1978. Mutation inhibiting protoplasmic incompatibility in Podospora anserina that suppresses an extracellular laccase and protoperithecium formation. Journal of General Microbiology 109: 187-189.

77. Langfelder, K., Streibel, M., Jahn, B., Haase, G. and Brakhage, A.A. 2003. Biosynthesis of fungal melanins and their importance for human pathogenic fungi. Fungal Genetics and Biology 38: 143-158.

78. Leatham, G.F. and Stahmann, M.A. 1981. Studies on the laccase of Lentinus edodes: specificity, localization and association with the development of fruiting bodies. Journal of General Microbiology 125: 147-157.

79. Lee, R.H., Brown, B.M. and Lolley, R.N. 1984. Light-induced dephosphorylation of a 33K protein in rod outer segments of rat retina. Biochemistry 23: 1972-1977.

80. Leonowicz, A., Cho, N.S., Luterek, J., Wilkolazka, A., Wojtas-Wasilewska, M., Matuszewska, A., Hofrichter, M., Wesenberg, D. and Rogalski, J. 2001. Fungal laccase: properties and activity on lignin. Journal of Basic Microbiology 41: 185-227.

81. Levine, M. and Davidson, E.H. 2005. Gene regulatory networks for development. Proceedings of the National Academy of Sciences U.S.A. 102: 4936-4942.

82. Li, L., Gerecke, E.E. and Zolan, M.E. 1999. Homolog pairing and meiotic progression in Coprinus cinereus. Chromosoma 108: 384-392.

83. Liang, P. and Pardee, A.B. 1992. Differential display of eukaryotic messenger RNA by means of the polymerase chain reaction. Science 257: 967-971.

84. Liu, Y., Srivilai, P., Loos, S., Aebi, M. and Kues, U. 2006. An essential gene for fruiting body initiation in the basidiomycete Coprinopsis cinerea is homologous to bacterial cyclopropane fatty acid synthase genes. Genetics 172: 873-884.

85. Louis, E.J. 2007. Evolutionary genetics: making the most of redundancy. Nature 449: 673-674.

86. Lu, B.C. 1972. Dark dependence of meiosis at elevated temperatures in the basidiomycetes Coprinus cinereus. Journal of Bacteriology 111: 833-834.

87. Lu, B.C. 1974. Genetic recombination in Coprinus. IV. A kinetic study of the temperature effect on recombination frequency. Genetics 78: 661-677.

88. Lu, B.C. 1974. Meiosis in Coprinus. V. The role of light on basidiocarp initiation, mitosis and hymenium differentiation in Coprinus lagopus. Canadian Journal of Botany 52: 299-305.

89. Lu, B.C. 2000. The control of meiosis progression in the fungus Coprinus cinereus by light/dark cycles. Fungal Genetics and Biology 31: 33-41.

90. Madelin, M.F. 1956. Studies on the nutrition of Coprinus lagopus Fr., especially as affecting fruiting. Annals of Botany 20: 307-330.

91. Magae, Y., Nishimura, T. and Ohara, S. 2004. 3-O-alkyl-D-glucose derivatives induce fruit bodies of Pleurotus. Mycological Research 109: 374-376.

92. Maida, S., Fujii, M., Skrzynia, C., Pukkila, P.J. and Kamada, T. 1997. A temperature-sensitive mutation of Coprinus cinereus, hyt1-1, that causes swelling of hyphal tips. Current Genetics 32: 321-326.

93. Malig, R., Varela, C., Agosin, E. and Melo, F. 2006. Accurate and unambiguous tag-to-gene mapping in serial analysis of gene expression. BMC Bioinformatics 7: 487-504.

94. Malinen, E., Kassinen, A., Rinttila, T. and Palva, A. 2003. Comparison of real-time PCR with SYBR Green I or 5'-nuclease assays and dot-blot hybridization with rDNA-targeted oligonucleotide probes in quantification of selected faecal bacteria. Microbiology 149: 269-277.

95. Mao, X., Buchanan, I.D. and Stanley, S.J. 2006. Development of an integrated enzymatic treatment system for phenolic waste streams. Environmental Technology 27: 1401-1410.

96. Margulies, M., Egholm, M., Altman, W.E., Attiya, S., Bader, J.S., Bemben, L.A., Berka, J., Braverman, M.S., Chen, Y.J., Chen, Z. et al. 2005. Genome sequencing in microfabricated high-density picolitre reactors. Nature 437: 376-380.

97. Martin, D.E. and Hall, M.N. 2005. The expanding TOR signaling network. Current Opinion in Cell Biology 17: 158-166.

98. Maruyama, K. and Sugano, S. 1994. Oligo-capping: a simple method to replace the cap structure of eukaryotic mRNAs with oligoribonucleotides. Gene 138: 171-174.

99. Matsumura, H., Reich, S., Ito, A., Saitoh, H., Kamoun, S., Winter, P., Kahl, G., Reuter, M., Kruger, D.H. and Terauchi, R. 2003. Gene expression analysis of plant host-pathogen interactions by SuperSAGE. Proceedings of the National Academy of Sciences U.S.A. 100: 15718-15723.

100. Matthews, T.R. and Niederpruem, D.J. 1973. Differentiation in Coprinus lagopus. II. Histology and ultrastructural aspects of developing primordia. Archiv für Mikrobiologie 88: 169-180.

101. Mattick, J.S. and Makunin, I.V. 2005. Small regulatory RNAs in mammals. Human Molecular Genetics 14: 121-132.

102. Mehta, S.U., Mattoo, A.K. and Modi, V.V. 1972. Ribitol and flavinogenesis in Eremothecium ashbyii. The Biochemical Journal 130: 159-166.

103. Mellon, F.M. and Casselton, L.A. 1988. Transformation as a method of increasing gene copy number and gene expression in the basidiomycete fungus Coprinus cinereus. Current Genetics 14: 451-456.

104. Miura, F., Kawaguchi, N., Sese, J., Toyoda, A., Hattori, M., Morishita, S. and Ito, T. 2006. A large-scale full-length cDNA analysis to explore the budding yeast transcriptome. Proceedings of the National Academy of Sciences U.S.A. 103: 17846-17851.

105. Miyazaki, Y., Nakamura, M. and Babasaki, K. 2005. Molecular cloning of developmentally specific genes by representational difference analysis during the fruiting body formation in the basidiomycete Lentinula edodes. Fungal Genetics and Biology 42: 493-505.

106. Moore, D. 1981. Developmental genetics of Coprinus cinereus: genetic evidence that carpophores and sclerotia share a common pathway of initiation. Current Genetics 3: 145-150.

107. Moore, D. 1995. Tissue formation, p. 423-466. In Gow, N.A.R. and Gadd, G.M. (ed.), The growing fungus. Chapman & Hall, London, United Kingdom.

108. Moore, D. 1996. Graviresponses in fungi. Advances in Space Research 17: 73-82.

109. Morimoto, N., Suda, S. and Sagara, N. 1981. Effect of ammonia on fruitbody induction of Coprinus cinereus in darkness. Plant Cell Physiology 22: 247-254.

110. Morrison, T.B., Weis, J.J. and Wittwer, C.T. 1998. Quantitation of low-copy transcripts by continuous SYBR Green I monitoring during amplification. BioTechniques 24: 954-962.

111. Munoz-Rivas, A., Specht, C.A., Drummond, B.J., Froeliger, E., Novotny, C.P. and Ullrich, R.C. 1986. Transformation of the basidiomycete, Schizophyllum commune. Molecular and General Genetics 205: 103-106.

112. Muraguchi, H. and Kamada, T. 1998. The ich1 gene of the mushroom Coprinus cinereus is essential for pileus formation in fruiting. Development 125: 3133-3141.

113. Namekawa, S., Hamada, F., Ishii, S., Ichijima, Y., Yamaguchi, T., Nara, T., Kimura, S., Ishizaki, T., Iwabata, K., Koshiyama, A., Teraoka, H. and Sakaguchi, K. 2003. Coprinus cinereus DNA ligase I during meiotic development. Biochimica et Biophysica Acta 1627: 47-55.

114. Namekawa, S., Hamada, F., Sawado, T., Ishii, S., Nara, T., Ishizaki, T., Ohuchi, T., Arai, T. and Sakaguchi, K. 2003. Dissociation of DNA polymerase alpha-primase complex during meiosis in Coprinus cinereus. European Journal of Biochemistry 270: 2137-2146.

115. Namekawa, S., Ichijima, Y., Hamada, F., Kasai, N., Iwabata, K., Nara, T., Teraoka, H., Sugawara, F. and Sakaguchi, K. 2003. DNA ligase IV from a basidiomycete, Coprinus cinereus, and its expression during meiosis. Microbiology 149: 2119-2128.

116. Namekawa, S.H., Iwabata, K., Sugawara, H., Hamada, F.N., Koshiyama, A., Chiku, H., Kamada, T. and Sakaguchi, K. 2005. Knockdown of LIM15/DMC1 in the mushroom Coprinus cinereus by double-stranded RNA-mediated gene silencing. Microbiology 151: 3669-3678.

117. Nara, T., Saka, T., Sawado, T., Takase, H., Ito, Y., Hotta, Y. and Sakaguchi, K. 1999. Isolation of a LIM15/DMC1 homolog from the basidiomycete Coprinus cinereus and its expression in relation to meiotic chromosome pairing. Molecular and General Genetics 262: 781-789.

118. Ng, P., Wei, C.L., Sung, W.K., Chiu, K.P., Lipovich, L., Ang, C.C., Gupta, S., Shahab, A., Ridwan, A., Wong, C.H., Liu, E.T. and Ruan, Y. 2005. Gene identification signature (GIS) analysis for transcriptome characterization and genome annotation. Nature Methods 2: 105-111.

119. Nielsen, K.L., Høgh, A.L. and Emmersen, J. 2006. DeepSAGE - digital transcriptomics with high sensitivity, simple experimental protocol and multiplexing of samples. Nucleic Acids Research 34: e133.

120. Okazaki, N., Okazaki, K., Watanabe, Y., Kato-Hayashi, M., Yamamoto, M. and Okayama, H. 1998. Novel factor highly conserved among eukaryotes controls sexual development in fission yeast. Molecular and Cellular Biology 18: 887-895.

121. O'Shea, S.F., Chaure, P.T., Halsall, J.R., Olesnicky, N.S., Leibbrandt, A., Connerton, I.F. and Casselton, L.A. 1998. A large pheromone and receptor gene complex determines multiple B mating type specificities in Coprinus cinereus. Genetics 148: 1081-1090.

122. Petersen, J.F., Kadziola, A. and Larsen, S. 1994. Three-dimensional structure of a recombinant peroxidase from Coprinus cinereus at 2.6 Å resolution. FEBS Letters 339: 291-296.

123. Pukkila, P.J., Yashar, B.M. and Binninger, D.M. 1984. Analysis of meiotic development in Coprinus cinereus. Symposia of the Society for Experimental Biology 38: 177-194.

124. Pukkila, P.J. and Lu, B.C. 1985. Silver staining of meiotic chromosomes in the fungus Coprinus cinereus. Chromosoma 91: 108-112.

125. Ramesh, M.A. and Zolan, M.E. 1995. Chromosome dynamics in rad12 mutants of Coprinus cinereus. Chromosoma 104: 189-202.

126. Ramos-Payan, R., Aguilar-Medina, M., Estrada-Parra, S., Gonzalez, Y.M.J.A., Favila-Castillo, L., Monroy-Ostria, A. and Estrada-Garcia, I.C. 2003. Quantification of cytokine gene expression using an economical real-time polymerase chain reaction method based on SYBR Green I. Scandinavian Journal of Immunology 57: 439-445.

127. Reue, K. 1998. mRNA quantitation techniques: considerations for experimental design and application. Journal of Nutrition 128: 2038-2044.

128. Riquelme, M., Challen, M.P., Casselton, L.A. and Brown, A.J. 2005. The origin of multiple B mating specificities in Coprinus cinereus. Genetics 170: 1105-1119.

129. Ririe, K.M., Rasmussen, R.P. and Wittwer, C.T. 1997. Product differentiation by analysis of DNA melting curves during the polymerase chain reaction. Analytical Biochemistry 245: 154-160.

130. Ronaghi, M., Karamohamed, S., Pettersson, B., Uhlen, M. and Nyren, P. 1996. Real-time DNA sequencing using detection of pyrophosphate release. Analytical Biochemistry 242: 84-89.

131. Saha, S., Sparks, A.B., Rago, C., Akmaev, V., Wang, C.J., Vogelstein, B., Kinzler, K.W. and Velculescu, V.E. 2002. Using the transcriptome to annotate the genome. Nature Biotechnology 20: 508-512.

132. Schmidt, W.M. and Mueller, M.W. 1999. CapSelect: A highly sensitive method for 5' CAP-dependent enrichment of full-length cDNA in PCR-mediated analysis of mRNAs. Nucleic Acids Research 27: e31.

133. Seitz, L.C., Tang, K., Cummings, W.J. and Zolan, M.E. 1996. The rad9 gene of Coprinus cinereus encodes a proline-rich protein required for meiotic chromosome condensation and synapsis. Genetics 142: 1105-1117.

134. Shih, S.M., Ng, T.P., Bian, X.L. and Kwan, H.S. 2003. Identification and characterization of differentially expressed genes in dikaryons of Lentinula edodes by cDNA microarray. 103rd General Meeting of American Society for Microbiology. Abstract.

135. Shiraki, T., Kondo, S., Katayama, S., Waki, K., Kasukawa, T., Kawaji, H., Kodzius, R., Watahiki, A., Nakamura, M., Arakawa, T., Fukuda, S., Sasaki, D., Podhajska, A., Harbers, M., Kawai, J., Carninci, P. and Hayashizaki, Y. 2003. Cap analysis gene expression for high-throughput analysis of transcriptional starting point and identification of promoter usage. Proceedings of the National Academy of Sciences U.S.A. 100: 15776-15781.

136. Siu, I.M., Lai, A. and Riggins, G.J. 2001. A database for regional gene expression in the human brain. Brain Research. Gene Expression Patterns 1: 33-38.

137. So, W.K., Kwok, H.F. and Ge, W. 2005. Zebrafish gonadotropins and their receptors: II. Cloning and characterization of zebrafish follicle-stimulating hormone (FSH) and luteinizing hormone (LH) subunits - Their spatial-temporal expression patterns and receptor specificity. Biology of Reproduction 72: 1382-1396.

138. Stanton, L.W. 2001. Methods to profile gene expression. Trends in Cardiovascular Medicine 11: 49-54.

139. Szeto, C.Y., Leung, G.S. and Kwan, H.S. 2007. Le.MAPK and its interacting partner, Le.DRMIP, in fruiting body development in Lentinula edodes. Gene 393: 87-93.

140. Tachibana, S. and Oka, M. 1981. Occurrence of a vitamin B2-aldehyde-forming enzyme in Schizophyllum commune. The Journal of Biological Chemistry 256: 6682-6685.

141. Thatcher, L.F., Carrie, C., Andersson, C.R., Sivasithamparam, K., Whelan, J. and Singh, K.B. 2007. Differential gene expression and subcellular targeting of Arabidopsis glutathione S-transferase F8 is achieved through alternative transcription start sites. The Journal of Biological Chemistry 282: 28915-28928.

142. Thomas, R.K., Nickerson, E., Simons, J.F., Janne, P.A., Tengs, T., Yuza, Y., Garraway, L.A., LaFramboise, T., Lee, J.C., Shah, K. et al. 2006. Sensitive mutation detection in heterogeneous cancer specimens by massively parallel picoliter reactor sequencing. Nature Medicine 12: 852-855.

143. Tsukahara, K., Yamamoto, H. and Okayama, H. 1998. An RNA binding protein negatively controlling differentiation in fission yeast. Molecular and Cellular Biology 18: 4488-4498.

144. Valentine, G., Wallace, Y.J., Turner, F.R. and Zolan, M.E. 1995. Pathway analysis of radiation-sensitive meiotic mutants of Coprinus cinereus. Molecular and General Genetics 247: 169-179.

145. Van Ruissen, F., Ruijter, J.M., Schaaf, G.J., Asgharnegad, L., Zwijnenburg, D.A., Kool, M. and Baas, F. 2005. Evaluation of the similarity of gene expression data estimated with SAGE and Affymetrix GeneChips. BMC Genomics 6: 91.

146. Velculescu, V.E., Zhang, L., Vogelstein, B. and Kinzler, K.W. 1995. Serial analysis of gene expression. Science 270: 484-487.

147. Wahl, M.B., Heinzmann, U. and Imai, K. 2005. LongSAGE analysis significantly improves genome annotation: identification of novel genes and alternative transcripts in the mouse. Bioinformatics 21: 1393-1400.

148. Walti, M.A., Villalba, C., Buser, R.M., Grunler, A., Aebi, M. and Kunzler, M. 2006. Targeted gene silencing in the model mushroom Coprinopsis cinerea (Coprinus cinereus) by expression of homologous hairpin RNAs. Eukaryotic Cell 5: 732-744.

149. Wang, S.M. 2006. Understanding SAGE data. Trends in Genetics 23: 42-50.

150. Wang, T. and Brown, M.J. 1999. mRNA quantitation by real time TaqMan polymerase chain reaction: validation and comparison with RNase protection. Analytical Biochemistry 269: 198-201.

151. Wang, H.X., Bun, T.N. and Ooi, V.E.C. 1998. Lectins from mushroom. Mycological Research 102: 897-906.

152. Wapinski, I., Pfeffer, A., Friedman, N. and Regev, A. 2007. Natural history and evolutionary principles of gene duplication in fungi. Nature 449: 54-61.

153. Weber, A.P., Weber, K.L., Carr, K., Wilkerson, C. and Ohlrogge, J.B. 2007. Sampling the Arabidopsis transcriptome with massively parallel pyrosequencing. Plant Physiology 144: 32-42.

154. Wei, C.L., Ng, P., Chiu, K.P., Wong, C.H., Ang, C.C., Lipovich, L., Liu, E.T. and Ruan, Y. 2004. 5' Long serial analysis of gene expression (LongSAGE) and 3' LongSAGE for transcriptome characterization and genome annotation. Proceedings of the National Academy of Sciences U.S.A. 101: 11701-11706.

155. Willardson, B.M. and Howlett, A.C. 2007. Function of phosducin-like proteins in G protein signaling and chaperone-assisted protein folding. Cellular Signalling 19: 2417-2427.

156. Wong, M.L. and Medrano, J.F. 2005. Real-time PCR for mRNA quantitation. BioTechniques 39: 75-85.

157. Wosten, H.A.B., Richter, M. and Willey, J.M. 1999. Structural proteins involved in emergence of microbial aerial hyphae. Fungal Genetics and Biology 27: 153-160.

158. Zhang, S., Sakuradani, E., Ito, K. and Shimizu, S. 2007. Identification of a novel bifunctional delta12/delta15 fatty acid desaturase from a basidiomycete, Coprinus cinereus. FEBS Letters 581: 315-319.

159. Zhang, Y., Liu, X.S., Liu, Q.R. and Wei, L. 2006. Genome-wide in silico identification and analysis of cis natural antisense transcripts (cis-NATs) in ten species. Nucleic Acids Research 34: 3465-3475.

160. Zhang, Z. and Dietrich, F.S. 2005. Mapping of transcription start site in Saccharomyces cerevisiae using 5' SAGE. Nucleic Acids Research 33: 2838-2851.

161. Zolan, M.E., Crittenden, J.R., Heyler, N.K. and Seitz, L.C. 1992. Efficient isolation and mapping of rad genes of the fungus Coprinus cinereus using chromosome-specific libraries. Nucleic Acids Research 20: 3993-3999.

162. Zolan, M.E., Heyler, N.K. and Ramesh, M.A. 1993. Gene mapping using marker chromosomes in Coprinus cinereus. In: Baltz, R.H., Hegeman, G.D. and Skatrud, P.L. (eds). Industrial Microorganisms: Basic and Applied Molecular Genetics. Washington, D.C.: American Society for Microbiology 31-35.

Appendix

Table A Summary of the 151 (out of 358) differentially expressed genes without protein homologues in the mycelial stage

Occurrence*^ No.a GLEAN model b Fold difference ^ Fisher p-value ‘ Myc Pri

4. ^070 3747 15 94 n noR+nn 7 6650 617 23 33.43 1.39E-195 8 9275 2082 975 2.66 8.11E-185 11 11350 415 1 517.22 7.27E-154 12 3725 589 82 8.95 6.99E-134 13 8468 343 0 854.97 3.09E-129 14 11577 312 3 129.62 1.31E-11] 16 2814 383 48 9.94 2.01E-91 17 6946 344 43 9.97 3.02E-82 18 6473 233 5 58.08 2.17E-79 19 8241 194 2 120.89 1.44E-69 20 7898 171 1 213.12 9.26E-63 21 12520 167 2 104.07 1.40E-59 23 7037 234 30 9.72 5.64E-56 24 1500 151 3 62.73 3.37E-52 26 863 100 0 249.26 3.65E-38 27 5354 97 2 60.45 7.97E-34 28 7036 239 81 3.68 5.81E-32 29 8189 121 17 8.87 1.25E-28 31 11825 73 0 181.96 4.71E-28 32 11344 77 2 47.98 1.57E-26 35 12499 219 85 3.21 9.29E-26 36 8486 131 27 6.05 9.32E-26 37 11089 200 73 3.41 4.26E-25 38 10726 65 0 162.02 4.66E-25 39 1787 61 0 152.05 1.46E-23 45 6494 48 0 119.65 1.08E-18 46 4001 75 10 9.35 1.54E-18 48 8240 43 0 107.18 8.03E-17 49 8507 42 0 104.69 1.90E-16 51 6947 100 33 3.78 1.73E-14 53 1832 36 0 89.73 3.35E-14 55 995 51 6 10.59 1.45E-13 56 1112 37 1 46.11 3.17E-13 57 4037 36 1 44.87 7.30E-13 59 7551 32 0 79.76 1.05E-12 63 5481 30 0 74.78 5.90E-12 64 11827 36 2 22.43 8.18E-12 70 764 309 248 1.55 3.81E-10 72 3767 24 0 59.82 1.04E-09 81 497 19 0 47.36 7.74E-08 83 1116 44 13 4.22 1.09E-07 87 10627 42 12 4.36 2.14E-07 89 8469 17 0 42.37 4.34E-07 91 4866 25 3 10.39 5.13E-07 92 4057 30 6 6.23 5.67E-07 93 9797 22 2 13.71 5.72E-07 95 11756 19 1 23.68 9.27E-07 98 4141 49 20 3.05 1.94E-06 9Q 4022 23 3

161 Table A (Continued)

ini 7Sf> IS in 7 102 9110 15 0 37.39 2.43E-06 104 6956 20 2 12.46 2.70E-06 106 1034 31 8 4.83 4.33E-06 107 29 17 1 21.19 4.69E-06 108 3612 19 2 11.84 5.83E-06 111 7038 35 12 3.64 9.85E-06 112 6627 16 1 19.94 1.05E-05 116 8243 13 0 32.40 1.36E-05 121 10956 23 5 5.73 2.28E-05 122 2365 21 4 6.54 2.34E-05 123 11821 21 4 6.54 2.34E-05 129 592 22 5 5.48 4.33E-05 131 10380 22 5 5.48 4.33E-05 133 8450 26 8 4.05 7.37E-05 134 6537 11 0 27.42 7.64E-05 136 7181 11 0 27.42 7.64E-05 141 2662 10 0 24.93 1.81E-04 146 9156 10 0 24.93 1.81E-04 148 7579 12 1 14.96 2.56E-04 150 11838 12 1 14.96 2.56E-04 152 10615 14 2 8.72 4.07E-04 153 1099 9 0 22.43 4.28E-04 155 3261 9 0 22.43 4.28E-04 157 6017 9 0 22.43 4.28E-04 159 8347 9 0 22.43 4.28E-04 160 10253 9 0 22.43 4.28E-04 163 11029 11 1 13.71 5.62E-04 165 6208 13 2 8.10 7.95E-04 166 11580 13 2 8.10 7.95E-04 169 2173 8 0 19.94 l.OlE-03 170 3836 8 0 19.94 l.OlE-03 171 8476 8 0 19.94 l.OlE-03 182 12496 15 4 4.67 1.71E-03 185 7348 23 10 2.87 2.19E-03 187 1501 7 0 17.45 2.40E-03 188 4052 7 0 17.45 2.40E-03 190 5124 7 0 17.45 2.40E-03 194 11663 7 0 17.45 2.40E-03 195 586 17 6 3.53 2.61E-03 196 865 9 1 11.22 2.65E-03 199 7897 11 2 6.85 3.04E-03 201 8406 20 8 3.12 3.27E-03 209 895 6 0 14.96 5.68E-03 212 3248 6 0 14.96 5.68E-03 214 4604 6 0 14.96 5.68E-03 215 7325 6 0 14.96 5.68E-03 216 7802 6 0 14.96 5.68E-03 218 10020 6 0 14.96 5.68E-03 222 5249 8 1 9.97 5.70E-03 224 5313 26 14 2.31 5.75E-03 228 11393 26 15 2.16 6.96E-03 232 11285 14 5 3.49 8.71E-03 235 615 9 2 5.61 1.15E-02 238 12312 9 2 5.61 1.15E-02 239 484 11 3 4.57 1.16E-02 241 8489 11 3 4.57 1.16E-02 245 9015 7 1 8.72 1.21E-02 271 11638 5 0 12.46 1.34E-02 m mi5 5 Q UAh LMEJR 162 Table A (Continued)

4Sni 5 n 12 46 1 MF-O?
257 4515 5 0 12.46 1.34E-02
260 5978 5 0 12.46 1.34E-02
265 8791 5 0 12.46 1.34E-02
266 9641 5 0 12.46 1.34E-02
267 9679 5 0 12.46 1.34E-02
269 10444 5 0 12.46 1.34E-02
274 11417 33 23 1.79 1.43E-02
277 2242 11 4 3.43 1.81E-02
280 5482 14 6 2.91 2.10E-02
282 2457 8 2 4.99 2.20E-02
285 4564 12 5 2.99 2.50E-02
287 7040 12 5 2.99 2.50E-02
290 1388 19 11 2.15 2.53E-02
291 2858 17 9 2.35 2.69E-02
296 871 4 0 9.97 3.18E-02
302 2458 4 0 9.97 3.18E-02
303 3826 4 0 9.97 3.18E-02
304 4202 4 0 9.97 3.18E-02
305 4350 4 0 9.97 3.18E-02
307 4949 4 0 9.97 3.18E-02
308 5027 4 0 9.97 3.18E-02
309 5103 4 0 9.97 3.18E-02
311 5380 4 0 9.97 3.18E-02
314 6471 4 0 9.97 3.18E-02
315 6789 4 0 9.97 3.18E-02
318 7997 4 0 9.97 3.18E-02
321 8570 4 0 9.97 3.18E-02
324 10814 4 0 9.97 3.18E-02
339 8168 7 2 4.36 4.17E-02
341 3807 13 7 2.31 4.36E-02
342 36 6 1 7.48 4.68E-02
343 569 6 1 7.48 4.68E-02
344 987 6 1 7.48 4.68E-02
347 4174 6 1 7.48 4.68E-02
349 4695 6 1 7.48 4.68E-02
350 5753 6 1 7.48 4.68E-02
351 6805 6 1 7.48 4.68E-02
352 7356 6 1 7.48 4.68E-02
355 10299 6 1 7.48 4.68E-02
356 10429 6 1 7.48 4.68E-02

a The numbering is not continuous because genes with protein homologues are not included.
b The GLEAN model number takes the form Gene:Jan06m300_GLEAN_XXXXX when querying the C. cinerea genome annotation website.
c Occurrence of the gene model is given by the sum of the counts of tags found 1 kb upstream.
d Fold difference is calculated by dividing the % abundance in mycelium by that in primordium.
e A p-value threshold of 0.05 in Fisher's exact test is used to confirm differential expression.
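The fold difference described in footnote d can be sketched as follows. This is a minimal illustration and not the script used in the thesis: the function name `fold_difference` is hypothetical, the two library totals are placeholders chosen only so that their ratio (about 1.2463, primordium to mycelium) agrees with what the tabulated rows imply, and the substitution of 0.5 for a zero count is inferred from rows such as No. 13 (343 versus 0 tags, fold 854.97) rather than stated in the footnotes.

```python
def fold_difference(count_a, count_b, total_a, total_b, zero_substitute=0.5):
    """Ratio of the % abundance in library A to the % abundance in library B.

    A zero count is replaced by `zero_substitute` so that the ratio stays
    finite (assumption inferred from the tabulated values, not from the text).
    """
    a = count_a if count_a > 0 else zero_substitute
    b = count_b if count_b > 0 else zero_substitute
    return (a / total_a) / (b / total_b)


# Hypothetical library sizes: only their ratio affects the result.
TOTAL_MYC = 100_000
TOTAL_PRI = 124_631

# Table A direction: mycelium divided by primordium.
row_7 = fold_difference(617, 23, TOTAL_MYC, TOTAL_PRI)
row_13 = fold_difference(343, 0, TOTAL_MYC, TOTAL_PRI)
print(round(row_7, 2), round(row_13, 2))
```

Swapping the argument order (primordium first) gives the direction used for genes that are more abundant in the primordial stage.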

Table B Summary of the 135 (out of 696) differentially expressed genes without protein homologues in the primordial stage

No. a   GLEAN model b   Occurrence c (Myc)   Occurrence c (Pri)   Fold difference d   Fisher p-value e

1 SOIA n 5(S6 908 9,8 1 UR-H-i

2 10846 2 476 190.96 9.30E-110

3 4267 0 349 560.05 1.02E-83

4 3623 3 291 77.83 2.32E-64

5 7980 0 226 362.67 1.44E-54

12 5232 0 135 216.64 1.25E-32

33 12504 0 51 81.84 7.69E-13

42 10837 29 136 3.76 2.14E-11

44 7546 1 50 40.12 3.71E-11

46 4206 26 126 3.89 5.34E-11

50 9811 0 43 69.00 9.34E-11

53 5233 88 256 2.33 1.58E-10

59 2622 112 292 2.09 1.58E-09

61 3656 1 42 33.70 2.23E-09

65 7458 10 73 5.86 4.09E-09

66 4094 11 75 5.47 4.46E-09

68 8002 0 35 56.17 5.95E-09

69 23 1 40 32.09 6.99E-09

73 6129 3 47 12.57 1.96E-08

77 2499 19 91 3.84 3.25E-08

87 6097 4 46 9.23 1.44E-07

90 11371 0 28 44.93 2.36E-07

94 9017 1 32 25.68 4.12E-07

97 9763 2 35 14.04 8.29E-07

105 6998 0 25 40.12 1.36E-06

109 9564 36 116 2.59 2.20E-06

133 1607 1 26 20.86 1.30E-05

135 8799 4 36 7.22 1.50E-05

142 652 6 41 5.48 2.01E-05

151 7443 0 20 32.09 2.96E-05

153 1574 2 27 10.83 3.85E-05

158 7962 11 52 3.79 5.33E-05

164 10036 506 865 1.37 6.18E-05

170 10818 1 22 17.65 7.10E-05

173 8625 17 63 2.97 9.87E-05

181 6714 60 146 1.95 1.28E-04

184 10060 529 890 1.35 1.43E-04

197 2718 3 27 7.22 2.65E-04

201 2380 13 51 3.15 3.21E-04

205 28 23 72 2.51 3.58E-04

219 594 0 14 22.47 5.76E-04

220 11876 0 14 22.47 5.76E-04

225 11544 1 18 14.44 6.59E-04

235 8665 5 29 4.65 8.00E-04

236 9257 5 29 4.65 8.00E-04

237 9908 5 29 4.65 8.00E-04

245 3661 0 13 20.86 1.05E-03

246 8540 0 13 20.86 1.05E-03

248 9324 26 74 2.28 1.09E-03

252 2737 27 75 2.23 1.23E-03

259 328 7 32 3.67 1.80E-03

262 1180 3 22 5.88 1.83E-03

261 mn 6 2Q ^

Table B (Continued)

267 15 0 12 19.26 1.94E-03

268 4502 0 12 19.26 1.94E-03

285 543 2 18 7.22 2.66E-03

286 1798 2 18 7.22 2.66E-03

288 312 13 44 2.72 2.90E-03

289 2263 27 71 2.11 2.94E-03

291 10847 3 21 5.62 2.99E-03

308 1767 0 11 17.65 3.61E-03

309 6301 0 11 17.65 3.61E-03

310 9813 0 11 17.65 3.61E-03

317 9435 1 15 12.04 3.67E-03

319 9574 11 39 2.84 3.80E-03

336 8023 4 22 4.41 4.92E-03

340 7619 8 31 3.11 5.54E-03

341 7906 21 57 2.18 5.82E-03

356 11237 6 26 3.48 6.77E-03

373 2667 0 10 16.05 6.79E-03

374 2680 0 10 16.05 6.79E-03

375 5584 0 10 16.05 6.79E-03

376 5927 0 10 16.05 6.79E-03

377 7812 0 10 16.05 6.79E-03

378 9450 0 10 16.05 6.79E-03

379 10245 0 10 16.05 6.79E-03

380 10522 0 10 16.05 6.79E-03

381 11361 0 10 16.05 6.79E-03

387 3274 2 16 6.42 7.31E-03

388 9847 2 16 6.42 7.31E-03

396 7254 3 19 5.08 7.99E-03

397 10856 3 19 5.08 7.99E-03

400 4198 13 41 2.53 8.23E-03

403 4794 20 54 2.17 9.17E-03

413 1857 1 12 9.63 1.06E-02

414 6397 1 12 9.63 1.06E-02

415 8767 24 60 2.01 1.10E-02

425 5936 4 20 4.01 1.20E-02

431 11545 2 15 6.02 1.21E-02

434 1405 3 17 4.55 1.27E-02

435 6406 3 17 4.55 1.27E-02

446 949 0 9 14.44 1.29E-02

447 2363 0 9 14.44 1.29E-02

448 6692 0 9 14.44 1.29E-02

449 8806 0 9 14.44 1.29E-02

452 5610 3 18 4.81 1.30E-02

479 968 1 11 8.83 1.80E-02

480 4484 1 11 8.83 1.80E-02

482 10618 10 32 2.57 1.82E-02

486 11546 7 25 2.87 1.99E-02

491 5452 3 16 4.28 1.99E-02

492 8542 3 16 4.28 1.99E-02

498 2486 2 14 5.62 2.01E-02

526 2736 0 7 11.23 2.38E-02

527 5504 0 7 11.23 2.38E-02

528 7772 0 7 11.23 2.38E-02

529 10184 0 7 11.23 2.38E-02

530 11362 0 7 11.23 2.38E-02

531 12410 0 7 11.23 2.38E-02

563 969 0 8 12.84 2.45E-02

564 4891 0 8 12.84 2.45E-02

565 9660 0 8 12.84 2.45E-02

222S Q 8 12M 2A5E^

Table B (Continued)

6881 7 94 7 75 7 88F.-0?.
576 7846 7 24 2.75 2.88E-02
589 2055 1 10 8.02 3.07E-02
590 7848 1 10 8.02 3.07E-02
611 11288 2 13 5.22 3.31E-02
623 11918 38 78 1.65 3.89E-02
665 3624 0.5 6 9.63 4.28E-02
666 4129 0.5 6 9.63 4.28E-02
667 5989 0.5 6 9.63 4.28E-02
668 8130 0.5 6 9.63 4.28E-02
669 9540 0.5 6 9.63 4.28E-02
670 10046 0.5 6 9.63 4.28E-02
671 10807 0.5 6 9.63 4.28E-02
672 10944 0.5 6 9.63 4.28E-02
673 11888 0.5 6 9.63 4.28E-02
678 400 4 17 3.41 4.42E-02
679 3247 4 17 3.41 4.42E-02
681 11320 12 32 2.14 4.75E-02
688 8649 3 14 3.74 4.85E-02
695 3128 6 21 2.81 4.93E-02
LL45S m 2S L25 4.9•制 2

a The numbering is not continuous because genes with protein homologues are not included.
b The GLEAN model number takes the form Gene:Jan06m300_GLEAN_XXXXX when querying the C. cinerea genome annotation website.
c Occurrence of the gene model is given by the sum of the counts of tags found 1 kb upstream.
d Fold difference is calculated by dividing the % abundance in primordium by that in mycelium.
e A p-value threshold of 0.05 in Fisher's exact test is used to confirm differential expression.
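The test named in footnote e can be illustrated with a small, self-contained sketch of a two-sided Fisher's exact test on a 2x2 table of tag counts (tags for one gene model versus all other tags, in each of the two libraries). Several points are assumptions: the function names are hypothetical, the library totals are placeholders because the true totals are not restated in this appendix, and the sidedness of the test used in the thesis is not given here. The p-values produced are therefore not expected to reproduce the tabulated ones.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(count_a, count_b, total_a, total_b):
    """Two-sided Fisher exact p-value for the table
    [[count_a, total_a - count_a], [count_b, total_b - count_b]].
    """
    gene_total = count_a + count_b
    grand_total = total_a + total_b
    log_denominator = log_comb(grand_total, gene_total)

    def log_prob(x):
        # Hypergeometric log-probability of seeing x of the gene's tags in library A.
        return (log_comb(total_a, x)
                + log_comb(total_b, gene_total - x)
                - log_denominator)

    lowest = max(0, gene_total - total_b)
    highest = min(gene_total, total_a)
    log_observed = log_prob(count_a)

    # Sum every outcome that is no more probable than the observed one.
    p_value = 0.0
    for x in range(lowest, highest + 1):
        lp = log_prob(x)
        if lp <= log_observed + 1e-7:
            p_value += exp(lp)
    return min(p_value, 1.0)


# Hypothetical library sizes, used only for illustration.
TOTAL_MYC = 100_000
TOTAL_PRI = 124_631
p = fisher_exact_two_sided(2, 476, TOTAL_MYC, TOTAL_PRI)
print(f"{p:.2e}", "differential" if p < 0.05 else "not differential")
```

Working in log space keeps the binomial coefficients from overflowing for libraries of this size. Where SciPy is available, `scipy.stats.fisher_exact` could be used instead of the hand-written loop.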
