Supplementary materials for

Kotlobay et al. 10.1073/pnas.1803615115

Supplementary Methods

Molecular biology

Identification of the nnluz gene. To achieve strong constitutive expression of the cDNA library from N. nambi, we constructed the GAP-pPic9K vector. The plasmid was created from the pPic9K vector (Invitrogen) by replacing the inducible alcohol oxidase promoter AOX1 and the alpha-factor signal sequence with the glyceraldehyde-3-phosphate dehydrogenase (GAP) promoter from Pichia pastoris. The GAP promoter sequence was obtained from the vector pGAPZA by digestion with BglII, filling in of the overhanging nucleotides with Klenow fragment and subsequent digestion with EcoRI. The pPic9K vector was digested with AatII, blunted with Klenow fragment, subsequently digested with EcoRI and used as a backbone for cloning of the GAP promoter fragment. Unique restriction sites BamHI, Eco53kI and SacI were further introduced by PCR with primers 5’-AATTGGATCCCAGAGGCTCGC-3’ and 5’-GGCCGCGAGCTCTGGGATCC-3’ and cloning into the EcoRI/NotI sites.

Total RNA from N. nambi mycelium was prepared according to the protocol of Chomczynski and Sacchi (1987) [ref. (1)], and the cDNA library was constructed with the SMART PCR cDNA Synthesis Kit (Clontech). Amplified cDNA was cloned into the GAP-pPic9K vector using BamHI/NotI restriction sites. The pPic9K vector spontaneously generates multiple insertion events after linearization, as linear DNA can generate stable transformants of Pichia pastoris via homologous recombination between the transforming DNA and regions of homology within the genome (2–4). Multiple insertion events occur spontaneously 10-100 times less often than single insertion events (GAP-pPic9K vector manual, ThermoFisher).

Extracted cDNA plasmid library was linearized at the AvrII restriction site and used for transformation of the Pichia pastoris GS115 strain by electroporation. Ten parallel transformations were performed with 1 ug of DNA per transformation according to the lithium acetate and dithiothreitol protocol described in [ref. (5)]. Electroporated cells were spread on 40 plates containing RDB medium (1 M sorbitol, 2% (w/v) glucose, 1.34% (w/v) yeast nitrogen base (YNB), 0.005% (w/v) non-essential amino acids, 0.00004% (w/v) biotin, 2% (w/v) agar). The final N. nambi cDNA library contained about 1 million yeast colonies. Each plate with transformants was sprayed with a 6.5 ug ml−1 solution of fungal luciferin (450-500 ul) and analysed by IVIS Spectrum CT (PerkinElmer). Genomic DNA from glowing colonies was isolated as described in [ref. (6)], amplified with primers 5’-CCTAGGAAATTTTACTCTGCTGGA-3’ and 5’-GCAAATGGCATTCTGACATCC-3’ and sequenced by Sanger sequencing. All sequenced glowing clones contained the same sequence, corresponding to the nnluz gene. This sequence was also found in the Neonothopanus nambi genome in contig 8227_4M_c489.

Reconstruction of the bioluminescent pathway in Pichia pastoris. Reconstruction of the bioluminescent pathway was performed by co-integration of genes coding for luciferase (nnluz), hispidin-3-hydroxylase (h3h) and hispidin synthase (hisps) from N. nambi, and 4’-phosphopantetheinyl transferase (npgA) from Aspergillus nidulans, into the genome of the Pichia pastoris GS115 strain. To discriminate clones with successful genome integration, each gene was supplied with a unique selective marker. Vectors pGAP-Kan, pGAP-Bsd and pGAP-Hyg were constructed from the pGAPZ vector (Invitrogen) by replacement of the zeocin resistance gene (Sh bl) with the kanamycin (aphA1), blasticidin (bsd) or hygromycin (hph) resistance genes, respectively. The aphA1 gene was amplified from the pET28 (Novagen) vector using the primer pair 5’-AAGGGAACATGTCTcatattcaacgggaaacgtcttg-3’ (PciI site underlined) and 5’-AAAAACACGAAGTGTTAGAAAAACTCATCGAGCATCAAA-3’ (DraIII site underlined) and cloned into pGAPZ, resulting in the pGAP-Kan plasmid. The bsd gene was amplified from the pPic6αA vector (Invitrogen) with the primer pair 5’-TCCTCACCATGGCCAAGCCTTTGTCTCAAG-3’ (NcoI site underlined) and 5’-ACAGACCACGAAGTGTTAGCCCTCCCACACATAACCA-3’ (DraIII site underlined) and cloned into the corresponding sites of the recipient plasmid, resulting in pGAP-Bsd. The hph gene was amplified from the pcDNA3.1/Hygro(+) vector (Invitrogen) using the primer pair 5’-AACCCATCATGAaaaagcctgaactcaccg-3’ (PagI site underlined) and 5’-TCTTTAGCGCTTTAttcctttgccctcggacg-3’ (AfeI site underlined). pGAPZ was digested with DraIII, the overhanging nucleotides were removed with Klenow fragment, and the linear DNA was then digested with NcoI and used for cloning of the HPH PCR product, resulting in pGAP-Hyg. To generate the GAP-pPic9 vector containing only the HIS selective marker, the GAP-pPic9K vector was digested with XhoI and BbvCI to remove the kanamycin resistance cassette. The ends of the DNA were blunted with Klenow fragment and re-ligated to generate the desired construct.

The coding sequence of nnLuz was amplified from a P. pastoris genomic DNA sample using primers 5’-CCTCCTTCGAAATGCGCATTAACATTAGCCTCTC-3’ (BstBI site underlined) and 5’-TCTTTGTCGACTTACTTGGCGTTTTCTACAATCTTTC-3’ (SalI site underlined), and cloned into the corresponding sites of the pGAP-Kan vector to generate nnLuz/pGAP_Kan. The h3h gene sequence of hispidin-3-hydroxylase was predicted bioinformatically as described below. Plasmid H3H/GAP-pPic9 was constructed by PCR amplification of the h3h gene from N. nambi cDNA using specific primers 5’-TGTTGGGATCCGCATCGTTTGAGAATTCTCTAAG-3’ (BamHI site underlined) and 5’-ATATAGCGGCCGCTCAGGCAGAATTAGAACTTCTTAGGA-3’ (NotI site underlined) and cloning into the corresponding sites of the GAP-pPic9 vector. Both nnLuz/pGAP_Kan and H3H/GAP-pPic9 were linearized with the restriction enzyme AvrII and transformed into P. pastoris GS115 by electroporation as described in [ref. (5)]. Transformed cells were selected on RDB plates without histidine (1 M sorbitol, 2% (w/v) glucose, 1.34% (w/v) yeast nitrogen base (YNB), 0.005% (w/v) non-essential amino acids, 0.00004% (w/v) biotin, 2% (w/v) agar) supplemented with 250 ug ml−1 G418. Colonies were sprayed with a 6.5 ug ml−1 solution of hispidin and analyzed by Fusion-Pulse.7 (Vilber Lourmat) equipped with a DARQ-7 camera. The brightest colonies were isolated and cultured with shaking for 72 h at 30°C in YPD medium (2% peptone, 1% yeast extract, 2% glucose).

Clone nnLuz-H3H, which had the highest specific activity of both enzymes, was used for the insertion of the polyketide synthase gene (hisps) from N. nambi and the 4’-phosphopantetheinyl transferase gene (npgA) from Aspergillus nidulans. The predicted 5.1 kb hisps gene without introns was ordered as a linear synthetic DNA and cloned into the pGAPZ vector using Gibson assembly. Correctness of the recombinant hisps gene was verified by Sanger sequencing. The 4’-phosphopantetheinyl transferase gene npgA was also ordered as linear synthetic DNA with flanking BstBI/SpeI restriction sites at the ends and cloned into the corresponding sites of pGAP-Bsd. The resulting plasmids NpgA/pGAP-Bsd and HispS/pGAPZ were linearized at the AvrII restriction site. Both constructs were electroporated into the nnLuz-H3H strain obtained above. Transformed cells were grown on YPDS plates (1 M sorbitol, 2% peptone, 1% yeast extract, 2% glucose, 2% agar) supplemented with 100 ug ml−1 zeocin and 200 ug ml−1 blasticidin S. Colonies were sprayed with a 3.3 mg ml−1 solution of caffeic acid and analyzed by Fusion-Pulse.7 (Vilber Lourmat) equipped with a DARQ-7 camera. The brightest clone, nnLuz-H3H-HispS-NpgA, was incubated with shaking for 72 h at 30°C in YPD medium (2% peptone, 1% yeast extract, 2% glucose) supplemented with 0.75% caffeic acid. This allowed direct observation of bioluminescence with the naked eye or by a camera (Fig. 3A).

The final step in production of an autonomously bioluminescent strain of Pichia pastoris was the introduction of additional genes for biosynthesis of caffeic acid from tyrosine. Genes of tyrosine ammonia-lyase from Rhodobacter capsulatus, 4-hydroxyphenylacetate 3-monooxygenase component HpaB and 4-hydroxyphenylacetate 3-monooxygenase component HpaC from Escherichia coli were chemically synthesized with a BstBI restriction site added at the 5’ end and a SpeI restriction site at the 3’ end for cloning into the corresponding sites of pGAP-Hyg. Constructs RcTAL/pGAP-Hyg, HpaB/pGAP-Hyg and HpaC/pGAP-Hyg were linearized at the AvrII restriction site and electroporated into the brightest nnLuz-H3H-HispS-NpgA strain obtained above. Transformed cells were grown on YPDS plates (1 M sorbitol, 2% peptone, 1% yeast extract, 2% glucose, 2% agar) supplemented with 150 ug ml−1 hygromycin B and 150 ug ml−1 L-tyrosine and analyzed by Fusion-Pulse.7 (Vilber Lourmat) equipped with a DARQ-7 camera. We detected around 10% autonomously glowing colonies, which corresponds to the spontaneous frequency of multiple plasmid integration into the genome of Pichia pastoris (pPIC9K. A Pichia Vector for Multicopy Integration and Secreted Expression). To isolate glowing clones from the total pool, colonies were transferred to another YPD plate (2% peptone, 1% yeast extract, 2% glucose, 150 ug ml−1 L-tyrosine, 2% agar) supplemented with a higher concentration of hygromycin B (300 ug ml−1). Surviving clones were analyzed by Fusion-Pulse.7 (Vilber Lourmat) equipped with a DARQ-7 camera (Fig. S12).

Whole genome sequencing of the engineered glowing yeast strain. We sequenced the genome of the autoluminescent strain to confirm insertion of the genes. Genomic DNA was extracted and purified according to the protocol from the kit for expression of recombinant proteins in Pichia pastoris (Pichia Expression Kit manual, Invitrogen, cat. K1710-01). Library preparation was performed with the TruSeq Nano DNA Low Throughput Library Prep Kit (Illumina, Cat. #20015964). We used multiplexed paired-end whole genome sequencing on an Illumina NovaSeq 6000 device with the NovaSeq 6000 S2 Reagent Kit (Illumina, Cat. #20012861) according to the manufacturer’s manual. Average length of the DNA fragments was about 350 bp before ligation of sequencing adapters. Raw sequencing reads were mapped to the P. pastoris GS115 reference genome and additional sequences of all seven inserted constructs by the BWA-MEM algorithm (7). Coverage calculations were performed with bedtools. Copy number estimations for each of the inserted genes are summarized in Table S3. Alignments of Pichia pastoris genome sequencing reads are available at Figshare (doi: 10.6084/m9.figshare.6738953).
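
The text does not spell out how copy numbers were derived from coverage. One common approach, shown here only as a minimal sketch, is to divide the mean read depth over an inserted gene by the mean depth over the host genome, assuming the host genome is present as a single copy per cell. The function name and the depth values are hypothetical and are not data from this study.

```python
from statistics import mean

def estimate_copy_number(insert_depths, genome_depths):
    """Ratio of mean per-base depth over an insert to mean depth over the host genome.

    Assumption (not stated in the source): host genome depth corresponds to one copy.
    """
    genome_mean = mean(genome_depths)
    if genome_mean == 0:
        raise ValueError("genome coverage is zero")
    return mean(insert_depths) / genome_mean

# Hypothetical per-base depths, for illustration only
genome = [100, 98, 102, 100]
insert = [205, 195, 200, 200]
print(round(estimate_copy_number(insert, genome), 2))
```

In practice the per-base depths would come from a coverage tool run on the BWA-MEM alignments; the lists above stand in for that output.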

Expression in E. coli and purification of nnLuz. The coding sequence of nnLuz was amplified using the primers 5’-GGATCCCGCATTAACATTAGCCTCTC-3’ (BamHI site underlined) and 5’-AAGCTTTCACTTGGCGTTTTCTACAATCTTTC-3’ (HindIII site underlined), and cloned into the pET-23b vector (Novagen). E. coli BL21 DE3 CodonPlus cells (Stratagene) transformed with the plasmid nnLuz/pET23 were cultivated with vigorous shaking at 37°C in LB medium containing 200 ug/ml ampicillin and 34 ug/ml chloramphenicol. When the culture reached an optical density of 0.6 at 600 nm, luciferase synthesis was induced with 0.4 mM IPTG. After induction, the cultivation was continued for 16 h at 25°C. Expression of nnLuz in E. coli resulted in accumulation of most of the recombinant protein in insoluble inclusion bodies. To purify the protein, cells were harvested by centrifugation at 4°C. The cells were suspended in 50 mM Tris-HCl, 150 mM NaCl, 5 mM EDTA, 10 mM 2-ME, pH 8.0, and disrupted by sonication using a Sonics VibraCell ultrasonic processor (Sonics and Materials Inc). After a 20-minute centrifugation at 10,000xg, the pellet (inclusion bodies) was dissolved in 8 M urea, 50 mM Tris-HCl, 1 mM TCEP, pH 8.0 buffer (buffer A) for 12 hours at room temperature. The solution was centrifuged at 30,000xg for 30 min at room temperature and the supernatant was applied onto a Ni Sepharose excel column (GE Healthcare Life Sciences) equilibrated with buffer A. After washing the column successively with buffer A, then with 0.5 M NaCl in buffer A and with 10 mM imidazole in buffer A, luciferase was eluted with 250 mM imidazole in buffer A. Pooled fractions were buffer-exchanged on a PD-10 column (GE Healthcare) into 10 mM Na-acetate, 6 M urea, pH 5.5 buffer. In vitro refolding of luciferase was performed by 50-fold dilution into 20 mM MES-NaOH, 5% glycerol, 1 mM DTT, pH 6.2 buffer at 10°C. After centrifugation at 30,000xg for 30 min at 4°C, the supernatant was concentrated by ultrafiltration using a 10 kDa membrane (Ultracel, Millipore) and stored at -70°C. Deletion of the transmembrane region did not lead to notable solubilization of nnLuz in E. coli cytoplasm. The purified and in vitro refolded nnLuz was soluble and stable in slightly acidic solutions (around pH 6.0) without detergents. However, for the luminescent activity of recombinant nnLuz the presence of detergents was strictly required.

Expression in Pichia pastoris. The coding sequence of nnLuz was amplified using primers 5’-CCTCCTTCGAAATGCGCATTAACATTAGCCTCTC-3’ (BstBI site underlined) and 5’-TCTTTGTCGACCTTGGCGTTTTCTACAATCTTTC-3’ (SalI site underlined) from nnLuz/pGAP_Kan and cloned into the corresponding sites of the pGAPZA vector (Invitrogen) in frame with a 6xHis tag. Pichia pastoris cells were transformed with nnLuz/pGAPZA as described in (5). Transformed cells were grown on YPD medium with 100 ug ml−1 zeocin, 2% agar. Colonies were sprayed with a 6.5 ug ml−1 solution of fungal luciferin and analyzed by IVIS Spectrum CT (PerkinElmer). The brightest colonies were isolated and cultured with shaking for 72 h at 30°C in YPD medium (2% peptone, 1% yeast extract, 2% glucose). Cells were harvested by centrifugation at 4°C and suspended in 100 ml of buffer (100 mM sodium phosphate buffer, 100 mM KCl, 5 mM EDTA, 2 mM TCEP, 1 mM PMSF, pH 7.4). Cells were disrupted with an APV2000 SPX homogenizer (SPX FLOW) (20 times, 600 bar) and centrifuged at 5,000xg (4°C) for 15 min. The supernatant was collected and centrifuged again at 140,000xg (4°C) for 1 hour. The supernatant was discarded and the pellet (microsomal fraction) containing nnLuz was obtained. The microsomal fraction was suspended in water and stored at -70°C.

Expression in HEK293NT, HeLa and CT26 cell lines. The coding sequence for nnLuz was amplified using the primers 5’-AGTCGCTAGCACCGCCATATGCGCATTAACATTAGCCTCT-3’ (NheI site underlined) and 5’-AGTCAGATCTTCACTTGGCGTTTTCTACAATCT-3’ (BglII site underlined), and cloned into the pKatushka2S-C1 vector (Evrogen) in place of the Katushka2S coding sequence. HEK293NT, HeLa and CT26 cell lines were transfected with nnLuz/C1 using FuGENE HD transfection reagent (Promega). Transfected cells were grown in DMEM medium (Paneco) supplemented with 10% fetal bovine serum (HyClone), 4 mM L-glutamine, 10 U ml−1 penicillin and 10 ug ml−1 streptomycin, at 37°C, 5% CO2. Fungal luciferin (650 ug/ml) was added 24 hours after transfection. Luminescence was detected with the Leica DM6000 microscope (Leica).

Extraction of native nnLuz from Neonothopanus nambi mycelium. 10 g of Neonothopanus nambi mycelium was washed with water, dried by removing excess water and frozen in liquid nitrogen. Frozen material was ground in a mortar and suspended in 100 ml of buffer (100 mM sodium phosphate buffer, 5 mM EDTA, 5 mM TCEP, 0.5 mM PMSF, pH 7.4). After a 30-minute incubation in ice-water, the suspension was sonicated for 10 minutes using a Sonics VibraCell ultrasonic processor (Sonics and Materials Inc) and centrifuged at 5,000xg (4°C) for 15 min. The supernatant was then collected and centrifuged again at 140,000xg (4°C) for 1 hour. The supernatant was discarded and the pellet (microsomal fraction) containing nnLuz was suspended in water using a Dounce grinder and stored at -70°C.

Denaturing polyacrylamide gel electrophoresis and amino acid sequence analysis. SDS-PAGE was performed according to [ref. (8)] using a 10-25% gradient polyacrylamide gel. Gel staining was done according to the silver staining protocol from (9), or using a standard Coomassie G250 staining. Protein bands were excised from the gel and were identified by LC-MS. In-gel trypsinolysis was done according to [ref. (10)]. Mass spectrometry analysis was performed on the Ultimate 3000 Nano LC System (Thermo Fisher Scientific), connected to a Q Exactive HF mass spectrometer (Thermo Fisher Scientific). For data analysis, we used Mascot software (Matrix Science).

Western blotting. After SDS-PAGE the proteins were electrophoretically transferred onto a nitrocellulose membrane. After blocking with 5% fat-free dry milk in 20 mM Tris-HCl, 150 mM NaCl, 0.05% Tween 20, pH 8.0 buffer, the membrane was incubated with a mouse anti-HisTag antibody conjugated to horseradish peroxidase (Abcam). The Immobilon Western kit (Millipore) was used for visualization. Signal was detected with a Vilber Lourmat Fusion system.

Luminescence measurements. Luminescence spectra were recorded using a Varian Cary Eclipse fluorescence spectrophotometer. For the bioluminescence spectra, a 1 cm agar square containing cultured Neonothopanus nambi mycelium was overlaid with 50 ul of a 20 uM solution of luciferin in water and was placed in a cuvette facing the emission PMT. For the chemiluminescence spectra of the Pichia pastoris microsomal fraction, 50 ul of microsomal suspension was mixed with 1 ml of 100 mM sodium phosphate, 0.1% DDM, pH 7.5 buffer. Obtained data were corrected for the kinetics of luminescence intensity during spectra acquisition.

pH and temperature dependence of recombinant nnLuz chemiluminescence. For measuring the pH dependence of purified nnLuz chemiluminescence, 5 ul of luciferase solution (obtained by dissolving Pichia pastoris microsomal fraction as described in the above section) was mixed with 100 ul of 200 mM buffer of the indicated pH, 0.1% DDM and was placed into a Promega Glomax 20/20 luminometer. Immediately after that, 25 ul of 25 uM fungal luciferin in 2 mM formic acid, 0.1% DDM was injected and the light intensity was recorded at 1 sec intervals. The results are shown in Fig. S10. For the temperature dependence of purified nnLuz stability, 25 ul of luciferase solution (obtained by dissolving Pichia pastoris microsomal fraction as described in the above section) in 20 mM MES-NaOH, 5% glycerol, 1 mM DTT, pH 6.2 buffer was incubated at the indicated temperature for 10 min. After that, luciferase activity was measured with the Promega Glomax 20/20 luminometer by injecting 25 ul of 15 uM fungal luciferin in 2 mM formic acid, 0.1% DDM into a vial containing 5 ul of luciferase sample in 100 ul of 200 mM sodium phosphate buffer, 0.5 M Na2SO4, 0.1% DDM, pH 8.0. The results are shown in Fig. S10.

Conversion of hispidin into luciferin in vitro. For this experiment we used cell lysate of a Pichia pastoris strain carrying a genomic copy of the h3h gene. 5 ul of the lysate was added to 40 ul of PBS buffer (pH 7.4) containing 1 mM NADPH and 1 uM hispidin. After 5 min of incubation at 25°C, 100 ul of assay buffer (100 mM sodium phosphate, 0.1% DDM, pH 7.5) and 0.5 ul of microsomal fraction from Pichia pastoris carrying genomic copies of the nnluz gene were added to induce light emission. Pichia pastoris wild type cells were used in the same experiment as a negative control. Kinetics of bioluminescence was registered by a custom-made luminometer (Oberon-K, Russia); results are shown in Fig. S21.

In vivo experiments

Heterologous expression of nnluz gene in the nervous system of developing Xenopus laevis embryo. The nnluz gene was subcloned into the pCS2+ vector by NheI (blunted)/EcoRI (blunted) and SalI/XhoI cloning. The resulting plasmid was linearized at the Acc65 restriction site and used for in vitro transcription to obtain nnLuz mRNA, using the SP6 mMessage mMachine Kit (Ambion). mRNA was purified with the CleanRNA Standard Kit (Evrogen) and injected into both blastomeres (~500 pg per blastomere) of two-cell-stage Xenopus laevis embryos. RLD (rhodamine lysine dextran) was used as a tracer. The fungal luciferin solution (660 ug/ml in DMSO) was injected into the blastocoel cavity of gastrula-stage embryos (stage 10.5). Luminescence was detected after luciferin injection with the DM6000 microscope (Leica).

Whole-body imaging in mice. Mus musculus carcinoma CT26 cells were injected subcutaneously into the back of a female 10-week-old mouse (BALB/c strain, animal facility of Pirogov Russian National Research Medical University, Moscow, Russia). Cells expressing nnLuz were injected into the left side of the back, and cells expressing Photinus pyralis luciferase were injected into the right side. Before injection, the mouse was anesthetized by isoflurane/air (2.5/97.5%) inhalation anesthesia. Ten minutes later, a mixture of fungal (0.5 mg) and firefly (0.5 mg) luciferin solutions was administered via intraperitoneal injection. Luminescence was visualized by IVIS Spectrum CT (PerkinElmer). The mouse was kept under sterile barrier conditions with the usual light/dark cycle, and was given food and water ad libitum. The animal was treated humanely with regard to alleviation of suffering. All experiments were approved by the local ethical committee of RNRMU and were carried out in accordance with EU Directive 2016/63/EU.

Determination of LC50 of fungal luciferin to HEK293T cells. HEK293T cells were seeded into a 96-well plate at 5,000 cells per well, each well supplied with 200 ul of full DMEM (described previously). Cells were incubated for 24 hours at 37°C with 5% CO2, then fungal luciferin in DMSO was added to each well. Concentrations of fungal luciferin were 0 ug/ml, 10 ug/ml, 25 ug/ml, 50 ug/ml, 100 ug/ml, 250 ug/ml, 500 ug/ml and 1000 ug/ml; each concentration was added to 3 wells to provide replicates. The concentration of DMSO was equal in each well (1%). Cells with luciferin and DMSO were incubated for 48 hours and then a viability test using alamarBlue (ThermoFisher) was performed. The results are shown in Fig. S18.

Transfection efficiency calculation. Approximately 300,000 HEK293T cells were seeded into a 35 mm Fluorodish (World Precision Instruments). After 24 hours (at 80% confluency), cotransfection of the constructs nnLuz-C1 and pTurboFP635-C in equimolar quantities was performed with FuGENE HD (Promega) according to the manufacturer's recommendations. Six hours after transfection the growth medium was replaced with a fresh aliquot of full DMEM, and after overnight incubation at 37°C and 5% CO2 the microscopy was carried out. Pictures of the cells (20x objective) were taken in 3 channels: brightfield, fluorescence in the red channel (ex. 560 nm, em. 607-682 nm), and luminescence in the green channel (>505 nm). In total, 11 sets of pictures of different areas with cells were taken. The number of all cells, cells visible in the green channel and cells visible in the red channel was counted manually for each set of pictures, and the level of transfection was calculated as the average of all data (Fig. S19).
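
The averaging step described above can be sketched as follows. This is an illustrative reading only: it assumes the level of transfection is the per-field fraction of positive cells averaged over the imaged fields, and the counts shown are hypothetical rather than the values behind Fig. S19.

```python
from statistics import mean

def transfection_efficiency(fields):
    """Mean per-field fraction of positive cells.

    fields: list of (total_cells, positive_cells) tuples, one per imaged area.
    """
    fractions = []
    for total, positive in fields:
        if total <= 0:
            raise ValueError("each field must contain at least one cell")
        fractions.append(positive / total)
    return mean(fractions)

# Hypothetical manual counts for three fields (the study used 11)
green_channel = [(100, 40), (80, 40), (50, 30)]
print(f"{transfection_efficiency(green_channel):.1%}")
```

The same function would be applied separately to the green-channel and red-channel counts.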

Relative growth rates of wild type strain of P. pastoris (GS115) versus autoluminescent strain of P. pastoris. Cells were grown overnight in liquid YPD medium with 100 ug ml−1 of zeocin. The next day, the cell suspension was diluted with fresh YPD medium to an optical density of 0.07 at 600 nm and cultivated with shaking for 21.5 h at 30°C. Aliquots were measured at time points of 0, 2.5, 5, 6.5, 8, 10, 13.5, 14.5, 16, 17.5, 20 and 21.5 hours. Each cell line was analyzed in three biological replicates and each sample was measured in triplicate (Fig. S20).
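
The source reports the growth curves themselves (Fig. S20) and does not state a rate formula. As a hedged illustration of how two strains could be compared from such optical density readings, the sketch below computes a specific growth rate and doubling time between two time points, assuming exponential growth over that interval. The function names and the readings are hypothetical.

```python
import math

def specific_growth_rate(od_start, od_end, hours):
    """Exponential growth rate (per hour) between two OD600 readings."""
    if od_start <= 0 or od_end <= 0 or hours <= 0:
        raise ValueError("OD values and time interval must be positive")
    return math.log(od_end / od_start) / hours

def doubling_time(od_start, od_end, hours):
    """Doubling time in hours, assuming exponential growth over the interval."""
    return math.log(2) / specific_growth_rate(od_start, od_end, hours)

# Hypothetical readings: starting OD600 of 0.07, three doublings over 6 h
print(round(doubling_time(0.07, 0.56, 6.0), 2))
```

The ratio of the two strains' rates over the same exponential-phase interval would then give a relative growth rate.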

Bioinformatic analysis of evolution

Gene prediction. For those genomes lacking a gene prediction, Augustus v3.1 [ref. (11)] was used to obtain the predictions using the parameters pre-calculated for Coprinus cinereus. We then used blastp to locate the proteins that belonged to the cluster. The proteins were then manually curated to ensure they were correctly predicted. Missing proteins were also searched for using exonerate (12) and added to the proteome if found. A complete list of proteomes used can be found in Table S1.

Species tree reconstruction. First, a blast search was performed using each gene encoded in Neonothopanus nambi against a database containing the complete proteomes of 57 species. The blast results were filtered (overlap > 0.3 and e-value < 1e-05) and then scanned to obtain those proteins that had a single hit in at least 40 of the species. 36 proteins fulfilled this requirement. Muscle v3.8 [ref. (13)] was then used to align each of the 36 protein families, trimAl v1.4 [ref. (14)] was used to filter the alignments (-gt 0.5), and the alignments were then concatenated into a single multiple sequence alignment. RAxML (version 8.0.3) (15) was then used to reconstruct the species tree using the PROTGAMMALG model.
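
The marker selection step above can be sketched as a simple filter. This is a minimal illustration, assuming each blast hit is available as a (query, species, overlap, e-value) tuple and that "a single hit" means exactly one hit passing the thresholds in a given species; the tuple layout, function name and toy data are hypothetical.

```python
from collections import defaultdict

def single_copy_queries(hits, min_overlap=0.3, max_evalue=1e-5, min_species=40):
    """Return queries with exactly one passing hit in at least min_species species.

    hits: iterable of (query, species, overlap, evalue) tuples.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for query, species, overlap, evalue in hits:
        if overlap > min_overlap and evalue < max_evalue:
            counts[query][species] += 1
    selected = []
    for query, per_species in counts.items():
        n_single = sum(1 for n in per_species.values() if n == 1)
        if n_single >= min_species:
            selected.append(query)
    return selected

# Toy data with a lowered species threshold, for illustration only
toy_hits = [
    ("q1", "A", 0.9, 1e-30), ("q1", "B", 0.8, 1e-20),
    ("q2", "A", 0.9, 1e-30), ("q2", "A", 0.7, 1e-10),  # two hits in species A
    ("q2", "B", 0.2, 1e-30),                           # fails the overlap filter
]
print(single_copy_queries(toy_hits, min_species=2))
```

With the toy data only `q1` is retained, because `q2` has two hits in one species and its other hit fails the overlap threshold.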

Gene trees reconstruction. For each of the proteins encoded in the cluster of Neonothopanus nambi, a blast search was performed against the complete database containing the 57 Agaricales genomes. Blast results were filtered using an overlap threshold of 0.5 and an e-value threshold of 1e-05. The number of homologs taken was limited to the 100 closest homologs. The homologous sequences were then aligned using three different alignment algorithms (Muscle v3.8 [ref. (13)], Mafft v6.814b [ref. (13)] and KALIGN 2.04 [ref. (16)]). Mafft was used with the --auto option while Muscle and Kalign were used with default parameters. Sequences were aligned in forward and in reverse (17). A consensus of the six resulting alignments was obtained using M-coffee (18) and then this alignment was filtered using trimAl v1.4 (-gt 0.1 and -cs 0.16667) (14). The trimmed alignment was then used to reconstruct the gene tree. First, a model selection process was performed by reconstructing a neighbour joining tree using BIONJ (19) and calculating the likelihoods using seven different models (JTT, WAG, MtREV, VT, LG, Blosum62, Dayhoff). The model with the best likelihood according to the AIC criterion (20) was then used to reconstruct a maximum likelihood tree using phyml v3.0 [ref. (21)]. Finally, a bootstrap analysis was run using 100 repetitions.
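
The AIC-based model choice described above can be illustrated with a short sketch. It uses the standard definition AIC = 2k - 2 ln L, where k is the number of free parameters and ln L the log-likelihood, and picks the model with the lowest value. The log-likelihoods and parameter counts below are hypothetical placeholders, not results from this analysis.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def best_model(fits):
    """Return the model name with the lowest AIC.

    fits: dict mapping model name -> (log_likelihood, n_params).
    """
    return min(fits, key=lambda name: aic(*fits[name]))

# Hypothetical fits of three substitution models on the same fixed tree
fits = {
    "JTT": (-1050.0, 10),
    "WAG": (-1020.0, 10),
    "LG": (-1000.0, 10),
}
print(best_model(fits))
```

When all candidate models have the same number of free parameters, as in this toy example, ranking by AIC reduces to ranking by likelihood.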

Gene trees were visualized using ETE v3.0 [ref. (22)]. Nodes with bootstraps below 50 were collapsed as they were deemed unreliable. Gene trees were rooted based on the species tree reconstructed above. As the age of the duplication event that occurred in the luciferase ancestor was not clear we added four additional homologs found in Eurotiomycetes (Ascomycota) to serve as outgroups.

Mapping of duplication and loss events. ETE v3.0 [ref. (22)] was used to analyse the gene trees. Duplications and speciation events were detected using a species-overlap algorithm which, for each node, checks whether there are overlapping species at either side of the node. If such overlap exists, the node is considered a duplication node; if not, it is considered a speciation node. The duplication node that led to the formation of luz, h3h and hisps was mapped onto the species tree by assuming that it occurred in the common ancestor of the species found at both sides of the node. Loss events were then detected by assuming that if all the members of a clade had lost the copy of the protein in question, then this loss should have occurred at the base of the clade.
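
The species-overlap rule described above can be written out in a few lines. The sketch below is a standalone illustration and does not use the ETE API: it assumes a binary gene tree encoded as nested tuples, with leaf names of the hypothetical form `Species_geneid`, and labels each internal node as a duplication when the species sets on its two sides overlap.

```python
def species_of(node):
    """Set of species codes found under a node (leaf names look like 'Nnam_g1')."""
    if isinstance(node, str):
        return {node.split("_")[0]}
    result = set()
    for child in node:
        result |= species_of(child)
    return result

def label_events(node, out=None):
    """Pre-order list of (event, species) for every internal node of a binary tree."""
    if out is None:
        out = []
    if isinstance(node, str):
        return out
    left, right = node
    left_sp, right_sp = species_of(left), species_of(right)
    event = "duplication" if left_sp & right_sp else "speciation"
    # The union of both sides is the species set whose common ancestor hosts the event
    out.append((event, sorted(left_sp | right_sp)))
    label_events(left, out)
    label_events(right, out)
    return out

# Toy gene tree: an ancient duplication followed by speciation in each paralog clade
toy_tree = (("Nnam_g1", "Pste_g1"), ("Nnam_g2", "Pste_g2"))
for event, species in label_events(toy_tree):
    print(event, species)
```

In the toy tree the root is labelled a duplication because both sides contain the same two species, while each child node is a speciation.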

Reconstruction of figure showing gene order. The nnLuz and nnH3H proteins for Neonothopanus nambi were located in a single contig that contained 8 proteins in total. Blast searches were performed from these 8 proteins to all the other genomes of interest. In the image, each square represents a gene that is found in the Neonothopanus nambi luciferase contig and that has a homolog in the other species. In grey are genes whose involvement in luciferin biosynthesis is unknown, while the coloured genes represent the luciferase (luz), hispidin-3-hydroxylase (h3h), caffeylpyruvate hydrolase (cph) and hispidin synthase (hisps) genes. The thinner lines represent genes that are found in other species but not in Neonothopanus nambi. The green lines are coloured to indicate homology between the different species while the black ones indicate no homolog was found of these genes in the region of interest.

Supplementary Figures

Figure S1. Pichia pastoris cells transformed with Neonothopanus nambi cDNA library and sprayed with a solution of fungal luciferin: ambient light image (left figure in each row), bioluminescence image (middle figure) and merged image (right figure in each row). (A) Agar plate with yeast colonies. (B) Part of an agar plate with luminescent yeast colonies. (C) Agar plate with single yeast colonies grown after extraction of luminescent colonies from the plate shown in (B). Color shows the relative intensity of light emission from the colonies.

Figure S2. Alignment of protein sequences of luciferases from various fungal species. This alignment is also available at Figshare (doi: 10.6084/m9.figshare.6738953.v2).

Figure S3. Proposed hispidin biosynthesis in Polyporus schweinitzii from tyrosine and phenylalanine. Scheme adapted from ref. (23).

Figure S4. Gene tree reconstructed for the luciferase. (A) Gene tree of the luciferase protein family. Leaves shadowed in gray indicate sequences of the bioluminescent species. Branches painted in red indicate duplication events while branches in blue indicate speciation events. Bootstrap supports are indicated by the thickness of the branch. Branches with a bootstrap support below 50 have been collapsed. The square made of dashes indicates species outside Agaricales that were added to properly root the tree. (B) Agaricales species tree showing the evolutionary history of the luciferase. “D” indicates where the duplication happened that led to the formation of luciferase. Red crosses indicate where the luciferase was lost. Scales in both trees estimate number of substitutions per site.

Figure S5. Gene tree reconstructed for the hispidin-3-hydroxylase. (A) Gene tree of the hispidin-3-hydroxylase. Leaves shadowed in gray indicate sequences of the bioluminescent species. Branches painted in red and blue indicate duplication and speciation events, respectively. Bootstrap supports are indicated by the thickness of the branch. Branches with a bootstrap support below 50 have been collapsed. Leaf names that contain an “x” followed by a number indicate collapsed clades and the number indicates the number of leaves the collapsed clade contained. The leaves preceding the duplication have been pruned for the image. (B) Agaricales species tree showing the evolutionary history of the hispidin-3-hydroxylase. “D” indicates where the duplication happened that led to the formation of hispidin-3-hydroxylase. Red crosses indicate where the hispidin-3-hydroxylase was lost. Scales in both trees estimate number of substitutions per site.

Figure S6. Gene tree reconstructed for the hispidin synthase protein family. (A) Gene tree of the hispidin synthase. Leaves shadowed in gray indicate sequences of the bioluminescent species. Branches painted in red and blue indicate duplication and speciation events, respectively. Bootstrap supports are indicated by the thickness of the branch. Branches with a bootstrap support below 50 have been collapsed. Leaf names that contain an “x” followed by a number indicate collapsed clades and the number indicates the number of leaves the collapsed clade contained. The leaves preceding the duplication have been pruned for the image. Squares to the right of the tree show a profile of presences and absences of Pfam domains as predicted in each of the proteins included. Grey squares indicate the domain is present and white squares indicate that the domain is absent. (B) Agaricales species tree showing the evolutionary history of the hispidin synthase. “D” indicates where the duplication happened that led to the formation of hispidin synthase. Red crosses indicate where the hispidin synthase was lost. Scales in both trees estimate number of substitutions per site.

Figure S7. SDS-PAGE analysis of luciferase expression in E. coli. Lanes: 1 – PageRuler protein ladder (Thermo Scientific). 2 – Cells before induction with IPTG. 3 – Cells after overnight expression at 25°C. 4 – Cell lysate, 140,000xg supernatant. 5 – Inclusion bodies solution in 8 M urea. Gel was stained with Coomassie Blue G-250. nnLuz is marked with an arrow. A 10-25% gradient polyacrylamide gel (AA:bisAA - 37.5:1) was used.

Figure S8. Western blot detection of luciferase expression in Pichia pastoris using anti-HisTag antibody. Lanes: 1 – lysate of cells expressing luciferase without HisTag, 5,000 × g supernatant. 2 – lysate of cells expressing luciferase with HisTag, 5,000 × g supernatant. 3 – lysate of cells expressing luciferase with HisTag, 140,000 × g supernatant. 4 – lysate of cells expressing luciferase with HisTag, 140,000 × g pellet.

Figure S9. Bioluminescence spectra of Neonothopanus nambi luciferase. Solid line shows the spectrum of Neonothopanus nambi mycelium, dashed line shows the spectrum of recombinant luciferase expressed in Pichia pastoris.

Figure S10. Activity of nnLuz in various pH and temperature conditions. Total light emission within 100 s period following the addition of the luciferin, in buffers with various pH (A) or after 10 min incubation of recombinant purified nnLuz at various temperatures (B). The data represent the mean ± SD for at least three measurements. All measurements were obtained on a Promega Glomax 20/20 luminometer. See Supplementary materials section “pH and temperature dependence of recombinant nnLuz chemiluminescence” for details.

Figure S11. nnLuz expression in Xenopus laevis embryo. Right embryo was microinjected with the mixture of rhodamine lysine dextran and nnLuz mRNA at two-cell stage, and then was microinjected with luciferin into the blastocoel cavity at gastrula stage. Left embryo – negative control (the same but without luciferase mRNA). (A) photo in bioluminescence channel; (B) photo in transmitted light; (C) rhodamine fluorescence; (D) photo in incident light.

Figure S12. Autonomously bioluminescent transformants of Pichia pastoris carrying genomic copies of luciferin biosynthesis genes (nnhisps, nnh3h, nnluz and npgA) and caffeic acid biosynthesis genes (TAL, hpaB and hpaC). (A) Image under ambient light. (B) Image in the bioluminescence channel.

Figure S13. In vivo assay for nnLuz activity in Pichia pastoris with negative controls. Clones of P. pastoris cells carrying genomic copies of the nnluz gene, in incident light and in the darkness, before and after spraying with solutions of caffeic acid (0.018 M L−1), hispidin (0.215 OD ml−1) and fungal luciferin (100 µg ml−1) in PBS with 1% DMSO. The same dish was sprayed sequentially with caffeic acid, hispidin and fungal luciferin. Each dish was sprayed with 450–500 µl of solution.

Figure S14. In vivo assay for nnH3H activity in Pichia pastoris with negative controls. Clones of P. pastoris cells carrying genomic copies of the nnluz and nnh3h genes, in incident light and in the darkness, before and after spraying with solutions of caffeic acid (0.018 M L−1), hispidin (0.215 OD ml−1) and fungal luciferin (100 µg ml−1) in PBS with 1% DMSO. The same dish was used for sequential spraying with caffeic acid and hispidin. Each dish was sprayed with 450–500 µl of solution.

Figure S15. In vivo assay for nnHispS activity in Pichia pastoris with negative controls. Clones of P. pastoris cells carrying genomic copies of the nnhisps, nnh3h, nnluz and npgA genes, in incident light and in the darkness, before and after spraying with solutions of caffeic acid (0.018 M L−1) and hispidin (0.215 OD ml−1) in PBS with 1% DMSO. Each dish was sprayed with 450–500 µl of solution.

Figure S16. Autonomously bioluminescent Pichia pastoris strain. Colonies of Pichia pastoris strain carrying genomic copies of luciferin biosynthesis genes (nnhisps, nnh3h, nnluz and npgA) and caffeic acid biosynthesis genes (TAL, hpaB and hpaC) in incident light and in the darkness.

Figure S17. Negative control for luciferin, hispidin and caffeic acid luminescence assays. Clones of wild type P. pastoris cells in incident light and in darkness before and after spraying with solutions of caffeic acid (0.018 M L−1), hispidin (0.215 OD ml−1) and fungal luciferin (100 µg ml−1) in PBS with 1% DMSO. The same dish was sprayed sequentially with caffeic acid, hispidin and fungal luciferin. Each dish was sprayed with 450–500 µl of solution.

Figure S18. Cell viability after the addition of fungal luciferin. X axis: fungal luciferin concentration (µg/mL); Y axis: absorbance at 570 nm (reference: 600 nm). Each point on the chart is the average of three values. Dashed lines mark the maximum value and half of the maximum value. Cell viability is proportional to absorbance at 570 nm, so the LC50 corresponds to a fungal luciferin concentration of approximately 300 µg/mL.
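The half-maximum reading described in the Figure S18 caption can be sketched as a linear interpolation between the two measured points that bracket half of the maximum absorbance. This is only an illustration: the function name and the data points are hypothetical (the measured values are not tabulated here), and linear interpolation is an assumption, not necessarily how the value in the caption was obtained.

```python
def half_max_concentration(conc, absorbance):
    """Concentration at which absorbance first falls to half of its maximum.

    Uses linear interpolation between the two bracketing points.
    Returns None if the curve never crosses the half-maximum level.
    """
    half = max(absorbance) / 2
    points = list(zip(conc, absorbance))
    for (c1, a1), (c2, a2) in zip(points, points[1:]):
        if a1 >= half >= a2 and a1 != a2:
            return c1 + (a1 - half) / (a1 - a2) * (c2 - c1)
    return None


# Hypothetical dose-response points (ug/mL, A570), NOT the data of Figure S18.
conc = [0, 100, 200, 400, 800]
a570 = [1.0, 0.9, 0.7, 0.3, 0.1]

lc50 = half_max_concentration(conc, a570)
print(f"Estimated LC50: {lc50:.0f} ug/mL")
```

With real measurements, a fitted dose-response model would usually be preferable to straight-line interpolation between two points.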

Figure S19. Transfection efficiency. Cotransfection of HEK293T cells with plasmids nnLuz-C1 and pTurboFP635-C1. Luminescence was detected in the green channel with the longpass filter (>505 nm), fluorescence was detected in the red channel with the bandpass filter (607-682 nm).

Figure S20. Photometric analysis of growth rates of Pichia pastoris wild type strain GS115 and its autoluminescent derivative. Data represent mean ± s.d. (n = a×b = 9, where a = 3 biological replicates and b = 3 technical replicates from one experiment).
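As a small illustration of the n = a×b = 9 summary used for Figure S20, the sketch below pools three technical replicates from each of three biological replicates at a single time point and reports mean ± s.d. The optical density values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the caption does not say which estimator was used.

```python
from statistics import mean, stdev

# Hypothetical OD readings at one time point:
# 3 biological replicates (rows) x 3 technical replicates (columns).
readings = [
    [0.50, 0.52, 0.51],
    [0.48, 0.49, 0.50],
    [0.53, 0.52, 0.54],
]

# Pool all a x b measurements into one sample of size n = 9.
pooled = [value for replicate in readings for value in replicate]
n = len(pooled)
avg = mean(pooled)
sd = stdev(pooled)  # sample standard deviation (assumption)

print(f"n = {n}, mean = {avg:.3f}, s.d. = {sd:.3f}")
```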

Figure S21. Conversion of hispidin into luciferin in vitro. The figure shows kinetics of bioluminescence detected after addition of synthetic hispidin and NADPH to cell lysates of wild type Pichia pastoris strain and strain expressing nnh3h.

Figure S22. Log intensity of bioluminescence of E. coli cell lysates. The figure shows kinetics of bioluminescence detected after addition of synthetic luciferin to cell lysates of the “wild type” E. coli strain (BL21 DE3 CodonPlus) and the strain expressing nnluz.

Supplementary tables

Table S1. List of species used. First column indicates species code, second column indicates species name, third and fourth columns indicate source and date of collection. Fifth column shows whether the proteome was predicted by us or not. Last column indicates whether the species is bioluminescent or not.

Bioluminescent clade

| Species code | Species name and version of genome assembly, if applicable | Source | Collection date | Predicted proteome | Luminous? |
|---|---|---|---|---|---|
| ARMFU | Armillaria fuscipes | NCBI | Sep 2016 | Yes | Yes |
| ARMGA | Armillaria gallica 21-2 v1.0 | JGI | Sep 2016 | - | Yes |
| ARMME | Armillaria mellea | JGI | Sep 2016 | - | Yes |
| ARMOS | Armillaria ostoyae 28-4 v1.0 | JGI | Sep 2016 | - | Yes |
| GYMAN | Gymnopus androsaceus JB14 v1.0 | JGI | Sep 2016 | - | No |
| GYMLU | Gymnopus luxurians v1.0 | JGI | Sep 2016 | - | No |
| GUYNE | Guyanagaster necrorhiza MCA 3950 | JGI | Sep 2016 | - | No |
| HYMRA | Hymenopellis radicata IJFM A160 v1.0 | JGI | Sep 2016 | - | No |
| LENED | Lentinula edodes | NCBI | Feb 2017 | Yes | No |
| MARFI | Marasmius fiardii PR-910 v1.0 | JGI | Sep 2016 | - | No |
| MONPE | Moniliophthora perniciosa FA553 | JGI | Sep 2016 | - | No |
| MYCCH | Mycena chlorophos | Uniprot | Feb 2017 | - | Yes |
| MYCCI | Mycena citricolor | This study | Sep 2016 | Yes | Yes |
| NEOGA | | This study | Feb 2017 | Yes | Yes |
| NEONA | Neonothopanus nambi | This study | Sep 2016 | Yes | Yes |
| OMPOL | Omphalotus olearius | JGI | Sep 2016 | - | Yes |
| OUDMU | Oudemansiella mucida CBS 558.79 v1.0 | JGI | Sep 2016 | - | No |
| PANBI | Panellus stipticus (bioluminescent strain from Blount county, Tennessee, USA) | This study | Feb 2017 | Yes | Yes |
| PANNB | Panellus stipticus (non-bioluminescent strain from Turkey) | This study | Feb 2017 | Yes | No |
| RHOBU | Rhodocollybia butyracea AH 40177 v1.0 | JGI | Sep 2016 | - | No |

Additional species

| Species code | Species name | Source | Collection date | Predicted proteome | Luminous? |
|---|---|---|---|---|---|
| AGABI | Agaricus bisporus | JGI | Sep 2016 | | |
| AGRPE | Agrocybe pediades AH 40210 v1.0 | JGI | Sep 2016 | | |
| AMAMU | Amanita muscaria Koide v1.0 | JGI | Sep 2016 | | |
| AMATH | Amanita thiersii Skay4041 v1.0 | JGI | Sep 2016 | | |
| BOLVI | Bolbitius vitellinus SZMC-NL-1974 v1.0 | JGI | Sep 2016 | | |
| COPCI | Coprinopsis cinerea | JGI | Sep 2016 | | |
| COPMA | Coprinopsis marcescibilis CBS121175 v1.0 | JGI | Sep 2016 | | |
| COPMI | Coprinellus micaceus FP101781 v2.0 | JGI | Sep 2016 | - | No |
| COPPE | Coprinellus pellucidus v1.0 | JGI | Sep 2016 | | |
| COPSC | Coprinopsis sclerotiger v1.0 | JGI | Sep 2016 | | |
| CORGL | Cortinarius glaucopus AT 2004 276 v2.0 | JGI | Sep 2016 | | |
| CREVA | Crepidotus variabilis CBS 506.95 v1.0 | JGI | Sep 2016 | | |
| CRULA | Crucibulum laeve CBS 166.37 v1.0 | JGI | Sep 2016 | | |
| CYAST | Cyathus striatus AH 40144 v1.0 | JGI | Sep 2016 | | |
| FISHE | Fistulina hepatica v1.0 | JGI | Sep 2016 | | |
| GALMA | Galerina marginata v1.0 | JGI | Sep 2016 | | |
| GYMCH | Gymnopilus chrysopellus PR-1187 v1.0 | JGI | Sep 2016 | | |
| GYMJU | Gymnopilus junonius AH 44721 v1.0 | JGI | Sep 2016 | | |
| HEBCY | Hebeloma cylindrosporum h7 v2.0 | JGI | Sep 2016 | | |
| HYPSU | Hypholoma sublateritium v1.0 | JGI | Sep 2016 | | |
| LACAM | Laccaria amethystina LaAM-08-1 v2.0 | JGI | Sep 2016 | | |
| LACBI | Laccaria bicolor v2.0 | JGI | Sep 2016 | | |
| LEPNU | Lepista nuda CBS 247.69 v1.0 | JGI | Sep 2016 | | |
| LEUGO | Leucoagaricus gongylophorus | JGI | Sep 2016 | | |
| MACFU | Macrolepiota fuliginosa v1.0 | JGI | Sep 2016 | | |
| PHOCO | Pholiota conissans CIRM-BRFM 674 v1.0 | JGI | Sep 2016 | - | No |
| PLEER | Pleurotus eryngii ATCC 90797 v1.0 | JGI | Sep 2016 | - | No |
| PLEOS | Pleurotus ostreatus PC15 v2.0 | JGI | Sep 2016 | - | No |
| PLICR | Plicaturopsis crispa v1.0 | JGI | Sep 2016 | - | No |
| PLUCE | Pluteus cervinus NL-1719 v1.0 | JGI | Sep 2016 | - | No |
| RICFI | Rickenella fibula HBK330-10 v1.0 | JGI | Sep 2016 | - | No |
| RICME | Rickenella mellea v1.0 (SZMC22713) | JGI | Sep 2016 | - | No |
| SCHCO | Schizophyllum commune H4-8 v3.0 | JGI | Sep 2016 | - | No |
| TRIMA | Tricholoma matsutake 945 v3.0 | JGI | Sep 2016 | - | No |
| VOLVO | Volvariella volvacea V23 | JGI | Sep 2016 | - | No |

Table S2. Primers used in this study

# Primers used in this study

1 5’-AATTGGATCCCAGAGGCTCGC-3’

2 5’-GGCCGCGAGCTCTGGGATCC-3’

3 5’-CCTAGGAAATTTTACTCTGCTGGA-3’

4 5’-GCAAATGGCATTCTGACATCC-3’

5 5’-AAGGGAACATGTCTCATATTCAACGGGAAACGTCTTG-3’

6 5’-AAAAACACGAAGTGTTAGAAAAACTCATCGAGCATCAAA-3’

7 5’-TCCTCACCATGGCCAAGCCTTTGTCTCAAG-3’

8 5’-ACAGACCACGAAGTGTTAGCCCTCCCACACATAACCA-3’

9 5’-AACCCATCATGAAAAAGCCTGAACTCACCG-3’

10 5’-TCTTTAGCGCTTTATTCCTTTGCCCTCGGACG-3’

11 5’-CCTCCTTCGAAATGCGCATTAACATTAGCCTCTC-3’

12 5’-TCTTTGTCGACTTACTTGGCGTTTTCTACAATCTTTC-3’

13 5’-TGTTGGGATCCGCATCGTTTGAGAATTCTCTAAG-3’

14 5’-ATATAGCGGCCGCTCAGGCAGAATTAGAACTTCTTAGGA-3’

15 5’-GGATCCCGCATTAACATTAGCCTCTC-3’

16 5’-AAGCTTTCACTTGGCGTTTTCTACAATCTTTC-3’

17 5’-CCTCCTTCGAAATGCGCATTAACATTAGCCTCTC-3’

18 5’-TCTTTGTCGACCTTGGCGTTTTCTACAATCTTTC-3’

19 5’-AGTCGCTAGCACCGCCATATGCGCATTAACATTAGCCTCT-3’

20 5’-AGTCAGATCTTCACTTGGCGTTTTCTACAATCT-3’
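Several primers in Table S2 carry restriction sites used for cloning. A minimal sketch of how such sites can be located in a primer is shown below. The recognition sequences are the standard ones for the listed enzymes; the dictionary names and the `find_sites` helper are illustrative and not part of the original methods. Because all listed sites are palindromic, scanning one strand is sufficient.

```python
# Standard recognition sequences (all palindromic).
SITES = {
    "BamHI": "GGATCC",
    "SacI": "GAGCTC",
    "NotI": "GCGGCCGC",
    "HindIII": "AAGCTT",
    "SalI": "GTCGAC",
    "BglII": "AGATCT",
}

# A few primers copied from Table S2 (keys are the table row numbers).
PRIMERS = {
    1: "AATTGGATCCCAGAGGCTCGC",
    2: "GGCCGCGAGCTCTGGGATCC",
    14: "ATATAGCGGCCGCTCAGGCAGAATTAGAACTTCTTAGGA",
}


def find_sites(seq):
    """Return {enzyme: 0-based position of the first match} for sites present in seq."""
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


for number, seq in PRIMERS.items():
    print(number, find_sites(seq))
```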

Table S3. Summary of whole-genome sequencing results of autoluminescent P. pastoris strain.

| Gene | Plasmid | Average coverage | Standard deviation | Predicted copy number |
|---|---|---|---|---|
| luz | nnLuz/pGAP_Kan | 324.90 | 28.33 | 1.00 |
| h3h | nnH3H/GAP-pPic9 | 405.63 | 28.96 | 1.00 |
| hisps | nnHispS/pGAPZ | 799.71 | 61.58 | 2.00 |
| npgA | NpgA/pGAP-Bsd | 1978.97 | 107.29 | 4.00 |
| TAL | RcTAL/pGAP-Hyg | 390.56 | 32.10 | 1.00 |
| hpaB | HpaB/pGAP-Hyg | 350.06 | 30.00 | 1.00 |
| hpaC | HpaC/pGAP-Hyg | 90.58 | 11.73 | 0.20* |
| All genes | | 483.69 | 82.26 | |

* The low coverage of the hpaC gene may be explained by insufficient selection of cells after transformation, because the same antibiotic resistance gene is present in other constructs (RcTAL/pGAP-Hyg and HpaB/pGAP-Hyg).
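Table S3 does not state how the predicted copy number was derived from coverage. One plausible reading, assumed here purely for illustration, is that each gene's average coverage is divided by the "All genes" average and then rounded. The sketch below computes that ratio from the values in the table; the variable and function names are hypothetical.

```python
# Average coverage per gene, copied from Table S3.
coverage = {
    "luz": 324.90,
    "h3h": 405.63,
    "hisps": 799.71,
    "npgA": 1978.97,
    "TAL": 390.56,
    "hpaB": 350.06,
    "hpaC": 90.58,
}
ALL_GENES_MEAN = 483.69  # "All genes" row of Table S3


def relative_coverage(gene_cov, baseline=ALL_GENES_MEAN):
    """Coverage of a gene relative to the baseline (assumed normalisation)."""
    return gene_cov / baseline


ratios = {gene: relative_coverage(cov) for gene, cov in coverage.items()}
for gene, ratio in ratios.items():
    print(f"{gene:6s} ratio = {ratio:.2f}")
```

Under this assumption the ratios for npgA and hisps round to 4 and 2, those for luz, h3h, TAL and hpaB round to 1, and hpaC comes out near 0.2, in line with the last column of the table.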

References:

1. Chomczynski P, Sacchi N (1987) Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal Biochem 162(1):156–159.

2. Cregg JM, Barringer KJ, Hessler AY, Madden KR (1985) Pichia pastoris as a host system for transformations. Mol Cell Biol 5(12):3376–3385.

3. Cregg JM, Madden KR (1989) Use of site-specific recombination to regenerate selectable markers. Mol Gen Genet 219(1-2):320–323.

4. Cregg JM, Madden KR, Barringer KJ, Thill GP, Stillman CA (1989) Functional characterization of the two alcohol oxidase genes from the yeast Pichia pastoris. Mol Cell Biol 9(3):1316–1323.

5. Wu S, Letchworth GJ (2004) High efficiency transformation by electroporation of Pichia pastoris pretreated with lithium acetate and dithiothreitol. Biotechniques 36(1):152–154.

6. Lõoke M, Kristjuhan K, Kristjuhan A (2011) Extraction of genomic DNA from yeasts for PCR-based applications. Biotechniques 50(5):325–328.

7. Li H (2013) Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv [q-bio.GN]. Available at: http://arxiv.org/abs/1303.3997.

8. Laemmli UK (1970) Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature 227(5259):680–685.

9. Shevchenko A, Wilm M, Vorm O, Mann M (1996) Mass spectrometric sequencing of proteins silver-stained polyacrylamide gels. Anal Chem 68(5):850–858.

10. Shevchenko A, Tomas H, Havlis J, Olsen JV, Mann M (2006) In-gel digestion for mass spectrometric characterization of proteins and proteomes. Nat Protoc 1(6):2856–2860.

11. Keller O, Kollmar M, Stanke M, Waack S (2011) A novel hybrid gene prediction method employing protein multiple sequence alignments. Bioinformatics 27(6):757–763.

12. Slater GSC, Birney E (2005) Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics 6:31.

13. Edgar RC (2004) MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5:113.

14. Capella-Gutierrez S, Silla-Martinez JM, Gabaldon T (2009) trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25(15):1972–1973.

15. Stamatakis A, Ludwig T, Meier H (2005) RAxML-III: a fast program for maximum likelihood-based inference of large phylogenetic trees. Bioinformatics 21(4):456–463.

16. Lassmann T, Sonnhammer EL (2005) Kalign – an accurate and fast multiple sequence alignment algorithm. BMC Bioinformatics 6:298.

17. Landan G, Graur D (2007) Heads or tails: a simple reliability check for multiple sequence alignments. Mol Biol Evol 24(6):1380–1383.

18. Wallace IM, O’Sullivan O, Higgins DG, Notredame C (2006) M-Coffee: combining multiple sequence alignment methods with T-Coffee. Nucleic Acids Res 34(6):1692–1699.

19. Gascuel O (1997) BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data. Mol Biol Evol 14(7):685–695.

20. Akaike H (1973) Information theory and extension of the maximum likelihood principle. Proceedings of the 2nd International Symposium on Information Theory, pp 267–281.

21. Le SQ, Gascuel O (2008) An improved general amino acid replacement matrix. Mol Biol Evol 25(7):1307–1320.

22. Huerta-Cepas J, Serra F, Bork P (2016) ETE 3: reconstruction, analysis, and visualization of phylogenomic data. Mol Biol Evol 33(6):1635–1638.

23. Lee I-K, Yun B-S (2011) Styrylpyrone-class compounds from medicinal fungi Phellinus and Inonotus spp., and their medicinal importance. J Antibiot 64(5):349–359.
