Unveiling of a novel bifunctional photoreceptor, Dualchrome1, isolated from a cosmopolitan green alga

Yuko Makita RIKEN Center for Sustainable Resource Science Shigekatsu Suzuki Center for Environmental Biology and Ecosystem Studies, National Institute of Environmental Studies Keiji Fushimi Shizuoka University Setsuko Shimada RIKEN Center for Sustainable Resource Science Aya Suehisa RIKEN Center for Sustainable Resource Science Manami Hirata RIKEN Center for Sustainable Resource Science Tomoko Kuriyama RIKEN Center for Sustainable Resource Science Yukio Kurihara RIKEN Hidehumi Hamasaki RIKEN Center for Sustainable Resource Science Kazutoshi Yoshitake The University of Tokyo Tsuyoshi Watanabe Japan Fisheries Research and Education Agency Takashi Gojobori King Abdullah University of Science and Technology Tomoko Sakami Japan Fisheries Research and Education Agency Rei Narikawa Shizuoka University https://orcid.org/0000-0001-7891-3510 Haruyo Yamaguchi National Institute for Environmental Studies https://orcid.org/0000-0001-5269-1614 Masanobu Kawachi National Institute for Environmental Studies Minami Matsui (  [email protected] ) RIKEN Center for Sustainable Resource Science

Article

Keywords: Blue-light Sensing, Chimeric Gene, Prasinophyte, Light adaptation Mechanisms, Light- harvesting Complex

Posted Date: October 14th, 2020

DOI: https://doi.org/10.21203/rs.3.rs-88004/v1

License:   This work is licensed under a Creative Commons Attribution 4.0 International License. Read Full License

1 Title: Unveiling of a novel bifunctional photoreceptor, Dualchrome1, isolated

2 from a cosmopolitan green alga

3

4 Authors: Yuko Makita1†, Shigekatu Suzuki2†, Keiji Fushimi3†, Setsuko Shimada1†, Aya

5 Suehisa1, Manami Hirata1, Tomoko Kuriyama1, Yukio Kurihara1, Hidefumi Hamasaki1,4,

6 Kazutoshi Yoshitake5, Tsuyoshi Watanabe6, Takashi Gojobori7, Tomoko Sakami8, Rei

7 Narikawa3, Haruyo Yamaguchi2, Masanobu Kawachi2, Minami Matsui1*.

8

9 Affiliations:

10 1Synthetic Genomics Research Group, RIKEN Center for Sustainable Resource Science,

11 Yokohama 230-0045, Japan.

12 2Center for Environmental Biology and Ecosystem Studies, National Institute for Environmental

13 Studies, Tsukuba 305-8506, Japan

14 3Graduate School of Integrated Science and Technology, Shizuoka University, 422-8529,

15 Shizuoka, Japan; Research Institute of Green Science and Technology, Shizuoka University,

16 Ohya, Suruga-ku, Shizuoka 422-8529, Japan; Core Research for Evolutional Science and

17 Technology, Japan Science and Technology Agency, Saitama 332-0012, Japan.

18 4Yokohama City University Kihara Institute for Biological Research, Totsuka, Yokohama,

19 Kanagawa, Japan.

1

20 5Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi,

21 Bunkyo, Tokyo, 113-8657, Japan

22 6Tohoku National Fisheries Research Institute, Japan Fisheries Research and Education Agency,

23 Shiogama, Miyagi 985-0001, Japan.

24 7Computational Bioscience Research Center, King Abdullah University of Science and

25 Technology, Thuwal, Kingdom of Saudi Arabia.

26 8Research Center for Aquaculture Systems, National Research Institute of Aquaculture, Japan

27 Fisheries Research and Education Agency, Minami-ise, Mie 516-0193, Japan.

28

29 *Corresponding author. Email: [email protected]

30 †These authors contributed equally to this study.

31

32

2

33 Abstract

34 Photoreceptors are conserved in to land plants, and regulate various developmental

35 stages. In the ocean, blue light penetrates deeper than red light, and blue-light sensing is key to

36 adapting to marine environments. A search for blue-light photoreceptors in the marine

37 metagenome uncovered a novel chimeric gene composed of a phytochrome and a cryptochrome

38 (Dualchrome1, DUC1) in a prasinophyte, provasolii. DUC1 detects light within the

39 orange/far-red and blue spectra, and acts as a dual photoreceptor. Its complete genome revealed

40 that P. provasolii facilitates light adaptation mechanisms via pheophorbide a oxygenase (Pao)

41 and prasinoxanthin. Genes for the light-harvesting complex (LHC) are duplicated and

42 transcriptionally regulated under monochromatic orange/blue light, suggesting P. provasolii has

43 acquired environmental adaptability to a wide range of light spectra and intensities.

44

45

46

3

47 Main text:

48 Photosynthetic organisms utilize various wavelengths of light, not only as sources of energy but

49 also as clues to assess their environmental conditions. Blue light penetrates deeper into the

50 ocean, whereas red light is absorbed and immediately decreases at the surface. Oceanic red algae

51 possess blue-light receptor cryptochromes (CRYs) but not red-light receptor phytochromes

52 (PHYs)1. Similarly, most chlorophytes have CRYs but fewer have PHYs. PHYs are bilin-

53 containing photoreceptors for the red/far-red light response. Interestingly, algal PHYs are not

54 limited to red and far-red responses. Instead, different algal PHYs can sense orange, green, and

55 even blue light2. They have the ability to photoconvert between red-absorbing Pr and far-red-

56 absorbing Pfr, and this conformational change enables interactions with signaling partners3. CRY

57 is a photolyase-like flavoprotein and widely distributed in bacteria, fungi, animals and plants.

58 In 2012-2014, large-scale metagenome analyses were performed in Sendai Bay, Japan, and the

59 western subarctic Pacific Ocean after the Great East Japan Earthquake to monitor its effects on

60 the ocean (http://marine-meta.healthscience.sci.waseda.ac.jp/crest/metacrest/graphs/). These

61 metagenome analyses targeted eukaryotic marine microorganisms as well as bacteria. Blue-light

62 sensing is key to adapting to marine environments. From a search for CRYs in the marine

63 metagenomic data, we found a novel photoreceptor, designated as Dualchrome1 (DUC1),

64 consisting of a two-domain fusion of PHY and CRY. We found that DUC1 originated from a

65 prasinophyte alga, Pycnococcus provasolii.

66 P. provasolii is a marine coccoid alga in , (or

67 prasinophytes clade V4) and was originally discovered in the pycnocline5. P. provasolii is

68 classified in , which is a sister group of the Streptophyta in .

69 Chlorophyta contain three major algal groups, Ulvophyceae, Trebouxiophyceae and

4

70 (UTC clade), and “prasinophytes”, which have several characteristics considered

71 to represent the last common ancestor of Viridiplantae. Prasinophytes mainly inhabit marine

72 environments and are dominant algae under various light qualities and intensities6. Thus,

73 prasinophytes are key to understanding the diversity and evolutionary history of the light

74 response system in the Viridiplantae.

75 Environmental DNA researches show P. provasolii lives at depths in the range of 0-100 m and

76 in varying regions of the marine6,7. It also has a unique pigment composition (prasinoxanthin and

77 Mg-2,4-divinylphaeoporphyrin α5) and an ability to adapt to the spectral quality (blue and blue-

78 violet) and low fluxes of light found in the deep euphotic zone of the open sea5,8.

79 We unveil herewith DUC’s ability to detect a wide range of the light spectrum (orange to far-

80 red for PHY and UV to blue for CRY) and undergo dual photoconversions at both the PHY and

81 CRY regions. We sequenced the complete genome of P. provasolii and examined its general

82 light-associated features. These findings will help to understand the evolutionary diversity of

83 photoreceptors in algae, and explain the environmental adaptation and success of P. provasolii.

84

85

86 Results

87 Identification of novel photoreceptor DUC1 from marine metagenome data

88 Since CRYs have been reported in several chlorophytes, we searched the marine metagenome

89 data with cryptochrome PHR and FAD as baits to target CRY genes. We called 542 assembled

90 metagenome sequences with the PHR, FAD and Rossmann-like_a/b/a_fold domains, and used a

91 combination of these three domains as a hallmark for CRY candidates. Among the candidate

5

92 genes identified, we found one with PHY-like sequence at its N-terminal region similar to

93 Arabidopsis phytochrome B (PHYB) (Fig. 1). This gene has no introns and can encode

94 angiosperm PHY domains at its N-terminus and CRY domains at its C-terminus (Fig. 1A).

95 In metagenome data from four collection points, fragments of this gene were detected in open

96 sea areas (C12 and A21 points) at both the Sendai Bay and A-line sampling stations (Fig. 1B).

97 To find the host plankton or its relative, we searched for a similar sequence in MMETSP (The

98 Marine Microbial Eukaryote Transcriptome Sequencing Project). We found transcriptome

99 fragments that matched with 100% identity to this gene, although we did not isolate the boundary

100 sequence between the PHY and CRY regions. We identified the host plankton as a marine

101 prasinophyte alga, Pycnococcus provasolii. Fortunately, a culture strain of it was stocked in the

102 NIES collection as P. provasolii NIES-2893 (Fig. 1C). Metagenome data also supported the fact

103 that there was enrichment of this species in the subsurface chlorophyll maximum layer of

104 stations C12 and A21 (Fig. S1). After amplification of this gene from NIES-2893 by RT-PCR,

105 we finally concluded that it does have both the PHY and CRY domains (5,073 bp, 183kDa

106 protein) and designated it as Dualchrome1 or DUC1 (Fig. S2A).

107 Phylogenetic analysis using the P. provasolii PHY (PpPHY) and CRY(PpCRY) domains

108 showed they branched with those of other chlorophytes (Fig. S3, S4), which is consistent with

109 the phylogenetic position of this species based on the 18S rRNA gene. Additionally, we found

110 that three other strains of P. provasolii and marina NIES-1419, which is a

111 close relative of P. provasolii, also possess the DUC1 gene. P. marina DUC1 showed 98.2%

112 amino acid identity with the P. provasolii DUC1 (PpDUC1) (Fig. S2B, C).

113

114

6

115 PpDUC1 senses blue, orange and far-red light

116 To verify whether DUC1 possesses photoconverter activity, we expressed the PHY (PpPHY,

117 70.2 kDa, amino acid positions 1–662 in PpDUC1) and CRY (PpCRY, 72.2 kDa, amino acid

118 positions 1039–1690 in PpDUC1) regions with His-tag and GST-tag, separately (Fig. 2A).

119 PHYs have a Cys residue either within the GAF domain or in the N-terminal loop region to

120 covalently bind to the bilin chromophore (Fig. S5A). The bilin chromophore shows light-induced

121 Z/E isomerization that triggers a reversible photocycle between the dark state (or ground state)

122 and the photoproduct state (or excited state) of the PHYs (Fig. S5A)9. The GAF Cys ligates to

123 C31 of phycocyanobilin (PCB) or phytochromobilin (PΦB), whereas the N-terminal Cys ligates

124 to C32 of biliverdin IXα (BV). As the PpPHY possesses the GAF Cys residue but not the N-

125 terminal Cys, the binding chromophore is likely to be PCB or PΦB. Furthermore, we detected a

126 bilin reductase homolog from the P. provasolii NIES-2893 genome (described later), which

127 belongs to the PcyA cluster, producing PCB from BV, but not to the HY2 cluster, producing

128 PΦB from BV (Fig. S6AB). Taken together, we presumed that the native chromophore for

129 PpPHY is PCB and not PΦB. Therefore, we expressed PpPHY in E. coli harboring the PCB-

130 synthetic system (C41_pKT271)10. The purified PpPHY covalently bound PCB (Fig. S5B, Fig.

131 S6C) and showed reversible photoconversion between an orange-absorbing (Po) form (λmax, 612

132 nm) in the dark state and a far-red-absorbing (Pfr) form (λmax, 702 nm) in the photoproduct state

133 (Fig. 2B, Table S1). Dark reversion from the photoproduct state to the dark state was not seen

134 (Fig. S7A). Furthermore, we observed that PΦB but not BV could be efficiently incorporated

135 into the apo-PpPHY to show reversible photoconversion by using the PΦB- and BV-synthetic

136 systems, which is consistent with previous studies (Fig. S6C-E). Taking the presence of the

7

137 PcyA homolog into consideration, it is likely that PpPHY reversibly senses orange and far-red

138 light via PCB incorporation in vivo.

139 CRYs non-covalently bind to a flavin chromophore, in many cases, flavin adenine

140 dinucleotide (FAD) (Fig. S5C). Some redox state forms of FAD, such as oxidized FAD

•– • 141 (FADOX), anion radical FAD (FAD ), neutral radical FAD (FADH ) and reduced FAD

– 11 142 (FADredH ), are observed in the photocycle of CRYs (Fig. S5C) . As the PpCRY accumulated

143 as insoluble inclusion bodies in the normal expression system, we expressed PpCRY in E. coli

144 harboring the chaperon co-expression system (C41_pG-KJE8). The purified PpCRY non-

145 covalently bound FAD (Fig. S5D, Fig. S6C). The protein showed a blue-absorbing (Pb) form

146 (λmax, 447 nm) before light irradiation (Fig. 2C, Table S1). The spectral form in the dark state

11 147 was similar to that of the FADOX-bound CRYs . Blue-light irradiation resulted in conversion to

148 a UV-absorbing (Puv) form (λmax, 366 and 399 nm) (Fig. 2C, Table S1). The spectral form in the

149 photoproduct state was similar to that of the FAD•–-bound CRYs11. Although we further

150 irradiated the protein with UV-to-blue light, no spectral change was observed. Instead, Puv-to-Pb

151 dark reversion occurred (Fig. S7B). To conclude, PpCRY showed photoconversion from the

•– 152 FADOX-bound form to the FAD -bound one, and the reverse reaction occurred in the dark. This

153 is similar to the photocycle of dCRY, a photoreceptor for circadian clock regulation in

154 Drosophila melanogaster12, rather than CraCRY from the green algae Chlamydomonas

155 reinhardtii13 and AtCRY1 and 2 from the land plant Arabidopsis thaliana14.

156 These results suggest that PpDUC1 is a broadband light sensor that can detects long-

157 wavelength light (i.e. orange to far-red) in the PpPHY region and short-wavelength light (i.e. UV

158 to blue) in the PpCRY region.

159

8

160 PpDUC1 partly complements photoreceptor mutation in Arabidopsis

161 To address the in vivo function of PpDUC1, we examined whether PpDUC1 complements the

162 defective phenotypes of Arabidopsis photoreceptor mutants. As PpDUC1 shows the highest

163 similarity with AtPHYB (28% identity) and AtCRY2 (38% identity), PpDUC1 was expressed in

164 Arabidopsis phyB and cry2 mutants (Fig. S8A). Since AtPHYB and AtCRY2 mediate light

165 inhibition of hypocotyl elongation in seedlings15,16, phyB and cry2 show a long-hypocotyl

166 phenotype compared with wild type. We observed that PpDUC1 partly suppressed the long-

167 hypocotyl phenotype of cry2 under blue light (Fig. 3AB), although no phenotype was seen in

168 darkness (Fig. 3C). The phyB mutants expressing PpDUC1 did not show any apparent effects on

169 hypocotyl elongation under orange and red light (Fig. S9). Since PpPHY can incorporate PΦB in

170 vitro with orange shift (Fig.S5B), it is possible that PpDUC1 cannot transduce downstream

171 signals in Arabidopsis.

172

173 PpDUC1 localizes at nucleus in tobacco leaves

174 It is reported that light conditions change the intracellular localization of photoreceptors. We

175 examined PpDUC1 localization using tobacco (Nicotiana benthamiana) cells. PpDUC1::GFP, as

176 well as the PHY region of PpDUC1 (PpPHY::GFP) and the CRY region of PpDUC1

177 (PpCRY::GFP), were introduced into tobacco cells (Fig. 3D). Expression of the GFP-fused

178 proteins was confirmed by protein-blot analysis (Fig. 3E). Arabidopsis phyB is reported to

179 localize in the nucleus after light irradiation17 but, although PpPHY::GFP contains all the PHY

180 sequences homologous to AtPHYB, it localized in the cytoplasm (Fig. 3F). Arabidopsis cry2

181 localizes in the nucleus18 and PpCRY::GFP was found mainly here with a weak signal in the

182 cytoplasm. PpDUC1::GFP localized in the nucleus.

9

183

184 P. provasolii responds to orange and far-red light for gene expression

185 We performed RNA-seq analysis to know how P. provasolii responds to monochromatic light

186 after dark acclimatization. The expression of 472 genes and 102 genes changed after orange and

187 far-red light irradiation, respectively. Since there is no photoreceptor that absorbs orange and/or

188 far-red light in P. provasolii, these changes of expression may be through photoperception by

189 PpDUC1. Orange and blue light commonly regulate 388 genes (Fig. 4A). All of these genes

190 fluctuate similarly in both orange and blue light, indicating the involvement of PpDUC1. GO

191 enrichment analysis showed that many of their categories are related to photosynthesis or light

192 responses (Fig. 4B). In P. provasolii, expression of the light-harvesting complex genes (LHC),

193 including 16 duplicated Lhcp, are induced by both orange and blue light, not by far-red light

194 (Fig. 4C).

195

196 P. provasolii possesses a unique gene repertoire for adaptation to various light conditions

197 We sequenced the complete genome of P. provasolii NIES-2893 (Fig. 1CD). The reads were

198 assembled into 43 scaffolds without gaps. The assembly showed high completeness (Table S2,

199 S3; Supporting text), and the scaffolds likely correspond to chromosomes. The genome possesses

200 some unique features (Fig. S10–S24; Supporting Text). The genome of P. provasolii is 22.7

201 Mbp, which is similar to other prasinophytes (12.5–22.0 Mbp) (Table 1). We predicted and

202 annotated 11,337 genes in the genome (Table 1) and compared orthogroups19 among some

203 prasinophytes: P. provasolii, Chloropicon primus, and five species in Mamiellophyceae (Fig.

10

204 S22). There were 3,996 conserved orthogroups (47.8% of P. provasolii genes) and a large

205 number of unique ones (3,636 orthogroups, 43.5% of the total).

206 We annotated five CRYs and one DUC1 but there is no PHY in the genome. Of the CRY genes,

207 two and the PpCRY region of PpDUC1 belong to the plant CRYs (pCRY13). These three genes

208 form a monophyly and branch at the basal position of those of other chlorophytes (Fig. S3).

209 Chlorophyte CRYs are a sister group of the streptophyte pCRYs, including AtCRY1 and AtCRY2.

210 The PpPHY region is a monophyly with other prasinophytes and streptophytes (Fig. S4). In the

211 prasinophytes, the genomes of some mamiellophyceans lack genes for pCRY and PHY (Fig. 5).

212 The predicted genes suggest that several important light signal transduction genes in land

213 plants are conserved. We detected COP1, CK2alpha and HY5 homologs but not those for PIF

214 and SPA1 for red light, nor CIB1 or BIC for blue-light signal transduction. We also confirmed

215 the conservation of light signal transduction genes in 13 chlorophyte genomes, but PIF, CIB1

216 and BIC were not found in any (Table S5).

217 In addition to photoreceptor and signaling genes, other genes encoded related to adaptation to

218 different light conditions. We predict that P. provasolii possesses 165 putative flagellar proteins

219 (Table S6). These genes compose a full set for flagellate structure and maintenance, which

220 indicates the existence of a flagellate stage of P. provasolii, as described in a previous

221 observation5. These results indicate that P. provasolii can move to favourable environments,

222 including those influenced by light conditions.

223 We found an expansion of some gene families (Table S7–S9; Supporting text). In particular,

224 the P. provasolii genome encoded a large gene family for a unique prasinophyte-specific light-

225 harvesting complex (Lhcp). P. provasolii possesses 16 genes for Lhcp, and 10 of these are

226 clustered in specific regions of chromosome 29 (Fig. S23). These 10 genes had almost the same

11

227 sequences and were monophyletic (Fig. 6A). This suggests that the lhcb gene on chromosome 10

228 was duplicated on chromosome 29 in P. provasolii. Similar duplications of lhcb genes are

229 observed in Mamiellophyceae, however, these have occurred independently to P. provasolii.

230 Interestingly, the P. provasolii genome encodes a gene for pheophorbide a oxygenase (Pao),

231 which is related to chlorophyll degradation. This gene is monophyletic with cyanobacteria and

232 streptophytes, while other prasinophytes do not possess it (Fig. 6). P. provasolii also possesses a

233 gene for magnesium dechelatase (Sgr), but it lacks one for red chlorophyll catabolite reductase,

234 indicating that P. provasolii can degenerate chlorophyll a to red chlorophyll catabolites.

235

236

237 Discussion

238 In this research, we have found a novel dual photoreceptor, PpDUC1, comprised of a fusion of

239 PHY and CRY. In terms of evolution, it is often speculated that different domains of one

240 organism’s protein are encoded by separate genes in another and has been used successfully to

241 speculate about direct physical interaction or indirect functional association20. It is reported that

242 phyB and cry2 physically interact to transduce light signals for controlling flowering time in

243 Arabidopsis21. Our discovery of PpDUC1 indicates that PHY and CRY interact actively to

244 enable proper perception of light signals. Another example is Neochrome, which is found in

245 ferns22, that is composed of a PHY domain and a PHOT domain. This chimeric protein also

246 supports the idea that there is dynamic interaction of photoreceptors.

247

248 Possible function of PpDUC1

12

249 Expression analysis of P. provasolii enabled observation of gene expression under blue and

250 orange light. It was found that 388 genes are expressed under these wavelengths indicating the

251 involvement of PpDUC1(Fig. 4). However, 76 and 49 genes are induced only by orange or far-

252 red light, respectively, and only by the PHY domain of PpDUC1. PpDUC1 localizes exclusively

253 in the nucleus under white light in tobacco cells (Fig. 3F) while the PpPHY domain mainly

254 localizes in the cytoplasm. PpPHY may have a different function from the AtPHYs. This is

255 consistent with the complementation assay that showed PpDUC1 partly complements cry2 but it

256 does not complement phyB (Fig. 3, Fig. S8). In the P. provasolii genome, there is HY5 but no

257 PIF homologs (Table S5) so PpDUC1 may have a unique signal transduction for light-dependent

258 gene expression.

259

260 Possible mechanism for blue-shifted spectral property of PpPHY

261 Photoreceptors of the Chlorophyta perceive various wavelengths of light to utilize sunlight to

262 direct their development, and land plants also have some of these receptors. Whilst most PHYs

263 of land plants perceive 660 nm and 730 nm for their photoconversion, PHYs of the Chlorophyta

264 (among them those of prasinophytes) perceive various wavelengths2. The PHY homologous

265 domain of PpDUC1 absorbs 612 nm and 702 nm for its photoconversion in vitro (Fig. 2b). This

266 shift to a shorter wavelength may be because of the ocean’s light environment. Despite limited

267 knowledge on algal PHYs, Rockwell et al. have reported that they show wide variation in their

268 spectral properties2. Some PHY molecules derived from other prasinophyte species show a

269 Po/Pfr photocycle similar to PpPHY, indicative of the same origin2. It is well known in

270 cyanobacteriochrome photoreceptors, distant relatives of the PHYs, that the trapped geometry of

271 the rotating ring D is crucial for blue-shifted absorption23. Residues unique to prasinophyte PHY

13

272 molecules would be crucial for such a twist of ring D. Notably, two Tyr residues conserved

273 among the plant and cyanobacterial PHYs that hold ring D are replaced with Phe, Met or Trp

274 residues in the prasinophyte PHYs (Fig. S24)24, which may contribute to holding ring D in the

275 trapped geometry, resulting in the absorption of blue-shifted orange light in the dark state Po

276 form.

277

278 Evolutionary diversity in chlorophyte photoreceptors

279 Many green algae share genes for PHY9 and pCRY25. Of the prasinophytes, Tetraselmis,

280 Nephroselmis, Micromonas, Dolichomactix, and Prasinoderma possess genes for functional

281 PHY1,2. However, the genomes of Chloropicon, Ostreococcus, and Bathycoccus lack any PHY

282 genes (Fig. 5). These findings suggest that PHY may have disappeared multiple times,

283 independently of the prasinophytes, from the ancestors of Chlamydomonas, Chloropicon,

284 Pycnococcus, Ostreococcus, and Bathycoccus. However, pCRY is highly conserved in

285 prasinophytes13, and only the genomes of Ostreococcus and Bathycoccus lack pCRY. In

286 evolutionary terms, PpDUC1 may have been generated via the fusion of genes for PHY and

287 pCRY in an ancestor of Pycnococcus (and Pseudoscourfieldia), because it is not found in the

288 genomes and transcriptomes (MMETSP26) of the other prasinophytes. Whole genome

289 duplication and large-scale genome rearrangements in P. provasolii (Supplementary Text) could

290 induce such a gene fusion. The P. provasolii genome lacks a gene for PHY, so it is probable that

291 it was deleted during the process of the fusion because its function was replaced by PpDUC1.

292

293 Dynamic adaptation mechanisms to various light spectra and intensities

14

294 Sensitivity to weak light is essential for marine algae. P. provasolii also has a unique pigment

295 composition (prasinoxanthin and Mg 2,4-D) and an ability to adapt to spectral quality (blue and

296 blue-violet) and low-light intensity5,8. Interestingly, it is not only specialized for weak light; its

297 genome possesses Pao and Sgr that are involved in chlorophyll (Chl) degradation in the

298 chlorophyll-protein complexes27. The Pao genes are mainly conserved in freshwater algae

299 (cyanobacteria and chlorophytes) and land plants, and are absent from the available genomes of

300 marine chlorophytes28. Under high-light radiation, algae and land plants degenerate antenna

301 complexes (LHCs and pigments) to decrease the absorbance of excess energy and prevent

302 photodamage29. The degradation of Chl triggers degradation of LHCs30. Therefore, P. provasolii

303 may adjust the amount of antenna complex by Chl degradation under strong light at the surface

304 of the sea.

305 For photosynthesis in marine environments, Chl b is suitable because its absorption peak is

306 shifted to blue-green compared with Chl a. This use of Chl b is different to land plants. Marine

307 prasinophytes possess Chl b, not only in LHCs but also in PSI core antennae5,31. P. provasolii

308 possesses LHCA, LHCB and a number of prasinophyte-specific LHCs (Lhcp) (Fig. 4 and S22).

309 The Lhcp makes complexes with Chl a/b and carotenoids32. Our transcriptome data showed that

310 all LHC genes were upregulated and some Lhcps were strongly upregulated under orange and

311 blue-light irradiation (Fig. 4). This result is consistent with LHC regulation in land plants.

312 Our studies show the dynamic adaptation mechanisms to light spectra and intensities in P.

313 provasolii. These mechanisms explain the strategies used to receive and utilize light, and to

314 expand the niches of Pycnococcus in the marine ecosystem.

315

15

316 Methods

317 Prediction of novel CRYs and identification of the host organism

318 We used the assembled marine metagenome data sampled at Sendai Bay and the western

319 subarctic Pacific Ocean 2012-2014 (http://marine-

320 meta.healthscience.sci.waseda.ac.jp/crest/metacrest/graphs/) and checked the protein domains

321 using InterProScan v5.40-77.033. Proteins were defined as candidate CRYs if they had the

322 following three domains: "DNA photolyase N-terminal (IPR006050)", "Rossman-like

323 alpha/bata/alpha sandwich fold (IPR014729)" and "Cryptochrome/DNA photolyase, FAD-

324 binding domain (IPR005101)", following the dbCRY method34. We also excluded known CRYs

325 that matched with the NCBI nr database35 by a blastp search36. To find the candidate host

326 organism, we applied a blast search against MMETSP (The Marine Microbial Eukaryote

327 Transcriptome Sequence Project) data37. The culture strain of P. provasolii NIES-2893 was

328 obtained from the National Institute for Environmental Studies, Japan.

329

330 Construction of phylogenetic trees for CRYs, photolyases and PHYs

331 The domain composition of PpDUC1 (1690 aa) from P. provasolii was assigned using

332 SMART(49), Pfam(50) and NCBI CDD(51). We used the PAS-GAF-PHY-PAS-PAS domain for

333 PHY, the PHR domain for CRYs and photolyases to construct the phylogenetic trees. Multiple

334 sequence alignment was performed by MAFFT v7.450 with the “--auto” parameter(52).

335 Neighbor-joining phylogenetic trees were constructed using MEGA7 or MEGA X software(53)

336 with 10,000 bootstrap replicates.

337

338

16

339 Methods for spectral characterization of PHY and CRY

340 Plasmid construction for expression of PpPHY and PpCRY

341 The Escherichia coli strains Mach1 T1R (Thermo Fisher Scientific) and DH5α (TaKaRa) were

342 used for DNA cloning. These cells were grown on Lysogeny Broth (LB) agar medium at 37°C.

343 The transformed cells were selected by 20 µg mL-1 kanamycin or 100 µg mL-1 ampicillin.

344 The PpPHY region (1–1986 bp in PpDUC1) was cloned into Nde I and BamH I sites of the

345 pET28a vector with N-terminal His-tag (Novagen) using the Gibson Assembly System (New

346 England Biolabs, Japan). The DNA fragment of the PpPHY region was amplified by PCR using

347 KOD OneTM PCR Master Mix (Toyobo Life Science) with a synthetic gene that was codon

348 optimized for expression in E. coli (Genscript) and an appropriate nucleotide primer set (forward

349 primer 5’-CGCGGCAGCCATATGCCGGTTCCGGTACCAGCC-3’ and reverse primer 5’-

350 CTCGAATTCGGATCCTTACTGCGCCACCAGCGAGCT-3’). The pET28a vector was

351 amplified by PCR using DNA polymerase with template DNA and an appropriate nucleotide

352 primer set (forward primer 5’-GGATCCGAATTCGAGCTC-3’ and reverse primer 5’-

353 CATATGGCTGCCGCGCGG-3’).

354 The PpCRY region (3115–5073 bp in PpDUC1) was cloned into the EcoR I and Xba I sites of a

355 pCold GST DNA vector with an N-terminal GST-tag (TaKaRa) using restriction enzymes and

356 ligase. A DNA fragment of the PpCRY region was amplified by PCR using PrimeSTAR Max

357 DNA Polymerase (TaKaRa) with genomic DNA from P. provasolii and an appropriate

358 nucleotide primer set (forward primer 5’-

359 ATGCGAATTCCCACCAGACTCCGATGTGAGCGGCG-3’ and reverse primer 5’-

360 GATCTCTAGACTACTGCGATCGATGGCGCTTCGCG-3’).

17

361 Restriction sites on each primer are indicated by underlining. All of the plasmid constructs were

362 verified by nucleotide sequencing (FASMAC).

363

364 Expression and purification of PpPHY fused with His-tag

365 The E. coli strain C41 (Cosmo Bio) was used for His-tagged PpPHY (amino acid positions 1–

366 662 in PpDUC1) expression through the pKT270, pKT271 and pKT272 constructs as biliverdin

367 IXα (BV), phycocyanobilin (PCB) and phytochromobilin (PΦB) synthetic systems,

368 respectively10,38. Bacterial cells were grown in LB medium containing antibiotics (20 µg mL-1

369 kanamycin and 20 µg mL-1 chloramphenicol) at 37°C. For protein expression, after the cells

370 reached an optical density of 0.4–0.8 at 600 nm, isopropyl β-D-1-thiogalactopyranoside (IPTG)

371 was added (final concentration, 0.1 mM), and the cells were cultured at 18°C overnight.

372 After incubation, the culture broth was centrifuged at 5,000 g for 15 min to collect the cells. The

373 cells were resuspended in lysis buffer A (20 mM HEPES–NaOH pH 7.5, 100 mM NaCl and 10%

374 (w/v) glycerol) with 0.5 mM tris(2-carboxyethyl)phosphine, and then disrupted using an

375 Emulsiflex C5 high-pressure homogenizer at 12,000 psi (Avestin). Homogenates were

376 centrifuged at 165,000 g for 30 min and then the supernatants were filtered through a 0.8 µm

377 cellulose acetate membrane before loading on to a nickel-affinity HisTrap HP column (GE

378 Healthcare) using the ÄKTA pure 25 (GE Healthcare) system. His-tagged PpPHY incorporated

379 with each chromophore was purified as described in previous studies39,40. The purified proteins

380 were dialyzed against lysis buffer A containing 1 mM dithiothreitol (DTT). Protein concentration

381 was determined by the Bradford method.

382

383 Expression and purification of PpCRY fused with GST-tag

18

384 The E. coli strain C41 was used for GST-tagged PpCRY (amino acid positions 1039–1690 in

385 PpDUC1) expression through the pG-KJE8 construct as a chaperone expression system

386 (TaKaRa). Bacterial cells were grown in 8 L LB medium containing 500 µg mL-1 L-arabinose

387 and antibiotics (5 ng mL-1 tetracycline, 100 µg mL-1 ampicillin and 20 µg mL-1 chloramphenicol)

388 at 37°C. For protein expression, once the optical density at 600 nm of the cells reached 0.4–0.8,

389 IPTG was added (final concentration, 1 mM), and the cells were cultured at 15°C overnight.

390 After incubation, the culture broth was centrifuged at 5,000 g for 15 min to collect the cells.

391 They were resuspended in lysis buffer B (20 mM HEPES–NaOH pH 7.5, 500 mM NaCl, 5 mM

392 DTT and 10% (w/v) glycerol), and then disrupted using a homogenizer at 12,000 psi. The

393 homogenate was centrifuged at 165,000 g for 30 min and the supernatant filtered through a

394 membrane before loading onto a glutathione-affinity GSTrap HP column (GE Healthcare). The

395 column was washed with lysis buffer B to remove unbound proteins, and GST-tagged PpCRY

396 was subsequently eluted with buffer containing 10 mM reduced glutathione. Protein

397 concentration was determined by the Bradford method. The extracted protein was handled in the

398 dark.

399

400 SDS-PAGE analysis for purified PpPHY and PpCRY

401 Purified proteins were diluted into 60 mM DTT, 2% (w/v) SDS and 60 mM Tris-HCl, pH 8.0,

402 and then denatured at 95ºC for 3 min. These samples were electrophoresed at room temperature

403 using 10% (w/v) polyacrylamide gels with SDS. The gels were soaked in distilled water for 30

404 min followed by monitoring of fluorescence bands for detection of chromophores covalently

405 bound to the proteins. These bands were visualized through a 600-nm long-path filter upon

19

406 excitation with blue (λmax = 470 nm) and green (λmax = 527 nm) light through a 562 nm short-

407 path filter using WSE-6100 LuminoGraph (ATTO) and WSE-5500 VariRays (ATTO) machines.

408

409 UV-Vis spectroscopic analysis to monitor photocycles of PpPHY and PpCRY

410 Ultraviolet and visible absorption spectra of the proteins were recorded with a UV-2600

411 spectrophotometer (SHIMADZU) at room temperature (r.t., approximately 20–25ºC). An Opto-

412 Spectrum Generator (Hamamatsu Photonics, Inc.) was used to produce monochromatic light of

413 various wavelengths to induce photoconversion.

414

415 Assignment of the chromophores incorporated into PpPHY

416 Sample solutions containing the native PpPHY in the dark state and the photoproduct state were

417 diluted five-fold in 8 M acidic urea (pH < 2.0). The absorption spectra were recorded at r.t.

418 before and after 3 min of illumination with white light. Assignment of the chromophores was

419 conducted by comparing the spectra between the denatured PpPHY and standard proteins41–43.

420

421 Assignment of the chromophore incorporated into PpCRY

422 Trichloroacetic acid was added to the sample solution containing the native PpCRY in the dark

423 state to a final concentration of 440 mM. The solution was treated on ice for 1 hour with shaking

424 at 200 rpm. The precipitate was removed by centrifugation at 20,000 g for 5 min, and then 50 μL

425 of the supernatant were injected into a high-performance liquid chromatograph (Prominence

426 HPLC system, Shimadzu). Released chromophore included in the supernatant was eluted

427 isocratically (flow rate, 1 mL / min) with a solvent (MeOH / 20 mM KH2PO4 = 20 / 80) and

428 separated using a reverse-phase HPLC column (InertSustainSwift C18, 4.6 i.d. × 250 mm, 5 μm;

20

429 GL Sciences) at 35ºC. The chromophore was detected by its absorption at 360 nm. Assignment

430 of the chromophores was performed based on their retention times (tR) compared with standard

431 compounds; 5,10-methenyltetrahydrofolate chloride (MTHF chloride, Schircks Laboratories),

432 flavin adenine dinucleotide (FAD, Tokyo Chemical Industry), flavin mononucleotide (FMN,

433 Tokyo Chemical Industry) and riboflavin (RF, Tokyo Chemical Industry). The sample and

434 standard compounds were handled under dark conditions.

435

436 Methods for in vivo functional analysis of PpDUC1 using plants

437 Generating transgenic Arabidopsis for complementation assay

438 PpDUC1 cDNA with an HA sequence and a stop codon was amplified from P. provasolii

439 genomic DNA using the DUC1-F primer (5’-CACCATGCCAGTGCCAGTGCCTG-3’) and

440 DUC1-HA-R primer (5’-CTAAGCATAATCTGGTACATCATATGG

441 ATACTGCGATCGATGGCGCTTCGC-3’). It was firstly cloned into pENTR/D-TOPO

442 (Invitrogen) and subsequently into the binary vector pMDC32 44 containing a CaMV 35S

443 promoter using the Gateway strategy (Thermo Fisher Scientific). The resulting CaMV 35S-

444 PpDUC1 constructs were transformed into phyB (Salk_022635) and cry2-118 mutants via the

445 floral dipping method45. The transgenic plants were screened on 1/2 MS medium containing 25

446 mg/L hygromycin.

447

448 Measurement of hypocotyl length of transgenic Arabidopsis

449 For measurement of hypocotyls, seeds were sown on GM medium, cold treated for three days in

450 the dark, and then exposed to white light for 6 hours to enhance germination. Following the

451 white light treatment, plants were grown at 22℃ under continuous blue light (peak 447 nm, half

21

452 bandwidth of 20 nm), orange light (peak 596 nm, half bandwidth of 20 nm), red light (peak 650

453 nm, half bandwidth of 25 nm) and in the dark. Seven-day-old seedlings were photographed, and

454 the hypocotyl lengths measured using ImageJ software (http://rsb.info.nih.gov/ij/).

455

456 Western blot analysis of PpDUC1-GFP transiently expressed in plants

457 Total protein was extracted from 80 mg of A. thaliana whole plants and N. benthamiana leaf

458 discs using the following buffer: 50 mM Tris-HCl (pH8.0), 150 mM NaCl, 0.1% 2-

459 mercaptoethanol, 5% glycerol, 0.5% Triton X-100, and cOmpleteTM protease inhibitor mini

460 cocktail (Sigma). The slurry was centrifuged twice to remove precipitates, and 10 μl of the SDS-

461 denatured supernatant were loaded onto a 7.5% SDS-polyacrylamide gel for electrophoresis.

462 After electrophoresis, proteins were electroblotted to a polyvinylidene difluoride (PVDF)

463 membrane (Millipore) in the following buffer: 25 mM Tris, 192 mM glycine and 20% methanol.

464 Subsequently, the membrane was incubated for 1.5 h in skimmed milk, rinsed two times with

465 1×TBST, and immunoblotted using peroxidase anti-HA (Roshe 1:1000 dilution) for 1.5 h to

466 detect PpDUC1-HA or rabbit anti-GFP antibody (Torrey Pines Biolabs Inc. 1:2000 dilution) for

467 20 h to detect PpDUC1-GFP. After incubation, the membrane was rinsed three times and

468 incubated for 0.5 h with AmershamTM protein A horseradish peroxidase linked antibody (GE

469 Healthcare Corp. 1:2000 dilution) to detect PpDUC1-GFP. The membrane was then rinsed three

470 times with 1×TBST as before and processed with a luminescent image analyzer (Chem Doc XRS

471 plus, BIO-RAD) after incubation with enhanced chemiluminescence reagents (AmershamTM

472 ECL SelectTM Western Blotting Detection Reagents) (GE Healthcare Corp.).

473

474 Constructs for transient transformation in N. benthamiana

22

475 For infiltration into the leaf epidermal cells of N. benthamiana, the full-length open reading

476 frames of PpDUC1, and parts of PpPHY and PpCRY (without their stop codons) were amplified

477 from P. provasolii genomic DNA by PCR with PpDUC1-F 5’-

478 ggggacaagtttgtacaaaaaagcaggcttcATGCCAGTGCCAGTGCCTGCTGC -3’ and PpDUC1-R 5’-

479 ggggaccactttgtacaagaaagctgggtcCTGCGATCGATGGCGCTTCGC -3’ for PpDUC1, PpPHY F

480 5’-ggggacaagtttgtacaaaaaagcaggcttcATGCCAGTGCCAGTGCCTGCTGC-3’ and PpPHY-R 5’-

481 ggggaccactttgtacaagaaagctgggtcAGCGTATGGCATGCACTGAAC-3’ for PpPHY, PpCRY-F

482 5’-ggggacaagtttgtacaaaaaagcaggcttcATGCCACCAGACTCCGATGTGAGCGGCG-3’ and

483 PpCRY-R 5’-ggggaccactttgtacaagaaagctgggtcCTGCGATCGATGGCGCTTCGC-3’ for PpCRY

484 and cloned into the pDONR207 ENTRY vector by BP recombination, according to the

485 manufacturer’s instructions (Invitrogen Corp.). In the primer sequences, uppercase letters

486 indicate sequences from genomic DNA, and lowercase letters indicate adaptor sequences for the

487 Gateway system. These open reading frames were transformed into the pEarleyGate 103

488 destination vector46 to fuse in frame with GFP and 6×his by LR reaction.

489

490 Transient expression and observation of PpDUC1-GFP in N. benthamiana leaves

491 Each construct was transformed into Agrobacterium GV3101. These agrobacteria were

492 infiltrated into the leaf epidermal cells of N. benthamiana according to Lewis et al., 200847. A

493 FluoView FV1000 (Olympus Corp.) microscope with x20 and x40 objectives was used to detect

494 GFP fluorescence. The excitation wavelength was 488 nm, and a band-pass filter of 510 to 525

495 nm was used for emission.

496

497 Methods for genome and transcriptome analysis

23

498 DNA extraction, genome sequencing, assembly and annotation

499 P. provasolii NIES-2893 cells were cultivated for 4–6 days in ~20 mL IMK medium (Nihon

500 Pharmaceutical, Tokyo, Japan) under white LED light (~50 µmol photons/m2/s) with a 14 h:10 h

501 light:dark cycle. Cells were collected by gentle centrifugation. DNA was extracted as described

502 in Suzuki et al.48. The paired-end library was constructed using a TruSeq DNA PCR-free Kit

503 (lllumina Inc., San Diego, CA, USA), according to the manufacturer’s protocol. The mate-pair

504 library was constructed using a Nextera Mate Pair Library Prep Kit (Illumina), according to the

505 manufacturer’s “gel-free” protocol. The paired-end and mate-pair libraries were sequenced using

506 the MiSeq System (Illumina) with MiSeq Reagent Kits v3 (300 bp × 2) (Illumina). For nanopore

507 sequencing, the library was constructed using a Rapid Sequencing Kit (SQK-RAD003, Oxford

508 Nanopore Technologies, Oxford, UK). Sequencing was performed using a MinION with a

509 SpotON Flow Cell (FLO-MIN106).

510 We sequenced 40,351,502 paired-end (9.7 Gbp), 10,652,248 mate-pair (1.7 Gbp), and 90,971

511 nanopore reads (0.50 Gbp, N50 = 18.8 kbp). The raw reads were assembled into 47 scaffolds

512 (22.9 Mbp) with one gap using MaSuRCA 3.3.249. The nanopore reads were corrected using

513 CONSENT v1.1.250. The paired-end reads were corrected using Trimmomatic version 0.3651.

514 The assembly was re-scaffolded by the corrected nanopore reads using Fast-SG52 with the k-mer

515 size = 70. Gaps were filled using LR_Gapcloser v1.153 with the corrected nanopore reads.

516 Finally, we mapped the corrected paired-reads using BWA version 0.7.1754, and polished the

517 assembly using Pilon 1.2255. The polishing was performed twice.

518 We removed the two sequences of the complete chloroplast and mitochondrial genomes.

519 Genome annotation was performed using the funannotate pipeline v1.5.3

520 (https://github.com/nextgenusfs/funannotate). For construction of the gene models, we used our

24

521 RNA-seq reads (19 Gbp) under various light conditions (described in the transcriptome section)

522 after read correction with Trimmomatic51. Functional annotation was performed using the

523 eggNOG web server56. To check expression of the gene models, we mapped the RNA-seq reads

524 used for the gene-model construction to the coding sequences (CDSs) using minimap2 2.1757.

525 We defined CDSs with RPK (read per kilobase) > 10 as “expressed genes”.

526

527 Transcriptome analysis under monochromatic light

528 The cells of P. provasolii NIES-2893 were grown at 19.5 ℃ with shaking at 120 rpm in a light

529 intensity of 13 µmol m−2 s−1 under a LD12:12 light/dark cycle. To monitor gene expression

530 during irradiation of monochromatic light, cells were plated and synchronized by a 12 h dark

531 treatment before being exposed to monochromatic constant light. Light treatments were

532 performed using LEDs (Nippon Medical & Chemical Instruments Co., Ltd., Japan) of specific

533 wavelengths at an intensity of 7 µmol m−2 s−1. The following settings were used: for blue light,

534 the peak was at 447 nm; for orange light, at 590 nm, and for far-red light, at 700 nm. After light

535 treatment for 2 hours, cells were harvested and frozen with liquid nitrogen. After cell disruption

536 using glass beads, total RNA was extracted from P. provasolii using the TRIzol reagent (Thermo

537 Fisher Scientific. INK, USA). The extracted total RNA was purified using RNeasy Plant Mini

538 Kits (QIAGEN, The Netherlands). Libraries for RNA-seq analysis were synthesized using

539 Illumina® TruSeq® Stranded mRNA Sample Preparation Kits (Illumina. K. K, USA) and

540 sequenced on an Illumina HiSeq 2000 using directed paired-end technology. The sequenced

541 reads were mapped to the P. provasolii genome with STAR v2.5.0c58 after trimming the low

542 quality reads (Phred quality of ≤20) with the FASTX-Toolkit-0.0.14

543 (http://hannonlab.cshl.edu/fastx_toolkit/index.html). Read counts were normalized with DESeq

25

544 in an R package59. Differentially expressed genes (DEGs) were defined with a q-value < 0.05

545 and fold-change (FC) < 2/3 or FC > 1.5.

546

547 Data and materials availability

548 The genome and transcriptome data were deposited in the DDBJ/EMBL/GenBank under the

549 accession number of XXX. The genome browser and tanscriptome data of P. provasolii are

550 available at: http://matsui-lab.riken.jp/JBrowse/index.html?data=data%2Fpycnococcus.

551

552

553 References

554 1. Li, F. W. et al. Phytochrome diversity in green plants and the origin of canonical plant

555 phytochromes. Nat. Commun. 6, 7852 (2015).

556 2. Rockwell, N. C. et al. Eukaryotic algal phytochromes span the visible spectrum. Proc. Natl.

557 Acad. Sci. U. S. A. 111, 3871–3876 (2014).

558 3. Rockwell, N. C., Su, Y. S. & Lagarias, J. C. Phytochrome structure and signaling

559 mechanisms. Annu. Rev. Plant Biol. 57, 837–858 (2006).

560 4. Lemieux, C., Otis, C. & Turmel, M. Six newly sequenced chloroplast genomes from

561 prasinophyte green algae provide insights into the relationships among prasinophyte

562 lineages and the diversity of streamlined genome architecture in picoplanktonic species.

563 BMC Genomics 15, 857 (2014).

564 5. Guillard, R. R. L., Keller, M. D., O’Kelly, C. J. & Floyd, G. L. PYCNOCOCCUS

565 PROVASOLII GEN. ET SP. NOV., A COCCOID PRASINOXANTHIN-CONTAINING

26

566 PHYTOPLANKTER FROM THE WESTERN NORTH ATLANTIC AND GULF OF

567 MEXICO1. Journal of Phycology vol. 27 39–47 (1991).

568 6. Shiozaki, T. et al. Eukaryotic Phytoplankton Contributing to a Seasonal Bloom and Carbon

569 Export Revealed by Tracking Sequence Variants in the Western North Pacific. Front.

570 Microbiol. 10, 2722 (2019).

571 7. Tragin, M., dos Santos, A. L., Christen, R. & Vaulot, D. Diversity and ecology of green

572 microalgae in marine systems: an overview based on 18S rRNA gene sequences.

573 Perspectives in Phycology vol. 3 141–154 (2016).

574 8. Iriarte, A. & Purdie, D. A. Photosynthesis and growth response of the oceanic picoplankter

575 Pycnococcus provasolii Guillard (clone Ω48-23) (Chlorophyta) to variations in irradiance,

576 photoperiod and temperature. J. Exp. Mar. Bio. Ecol. 168, 239–257 (1993).

577 9. Rockwell, N. C. & Lagarias, J. C. Phytochrome evolution in 3D: deletion, duplication, and

578 diversification. New Phytol. 225, 2283–2300 (2020).

579 10. Miyake, K. et al. Functional diversification of two bilin reductases for light perception and

580 harvesting in unique cyanobacterium Acaryochloris marina MBIC 11017. FEBS J. (2020)

581 doi:10.1111/febs.15230.

582 11. Liu, B., Liu, H., Zhong, D. & Lin, C. Searching for a photocycle of the cryptochrome

583 photoreceptors. Curr. Opin. Plant Biol. 13, 578–586 (2010).

584 12. Berndt, A. et al. A novel photoreaction mechanism for the circadian blue light

585 photoreceptor Drosophila cryptochrome. J. Biol. Chem. 282, 13011–13021 (2007).

586 13. Kottke, T., Oldemeyer, S., Wenzel, S., Zou, Y. & Mittag, M. Cryptochrome photoreceptors

587 in green algae: Unexpected versatility of mechanisms and functions. J. Plant Physiol. 217,

588 4–14 (2017).

27

589 14. Lin, C. et al. Association of flavin adenine dinucleotide with the Arabidopsis blue light

590 receptor CRY1. Science 269, 968–970 (1995).

591 15. Reed, J. W., Nagatani, A., Elich, T. D., Fagan, M. & Chory, J. Phytochrome A and

592 Phytochrome B Have Overlapping but Distinct Functions in Arabidopsis Development.

593 Plant Physiol. 104, 1139–1149 (1994).

594 16. Lin, C. et al. Enhancement of blue-light sensitivity of Arabidopsis seedlings by a blue light

595 receptor cryptochrome 2. Proc. Natl. Acad. Sci. U. S. A. 95, 2686–2690 (1998).

596 17. Yamaguchi, R., Nakamura, M., Mochizuki, N., Kay, S. A. & Nagatani, A. Light-dependent

597 translocation of a phytochrome B-GFP fusion protein to the nucleus in transgenic

598 Arabidopsis. J. Cell Biol. 145, 437–445 (1999).

599 18. Guo, H., Yang, H., Mockler, T. C. & Lin, C. Regulation of flowering time by Arabidopsis

600 photoreceptors. Science 279, 1360–1363 (1998).

601 19. Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative

602 genomics. Genome Biol. 20, 238 (2019).

603 20. Marcotte, E. M. et al. Detecting protein function and protein-protein interactions from

604 genome sequences. Science 285, 751–753 (1999).

605 21. Más, P., Devlin, P. F., Panda, S. & Kay, S. A. Functional interaction of phytochrome B and

606 cryptochrome 2. Nature 408, 207–211 (2000).

607 22. Kawai, H. et al. Responses of ferns to red light are mediated by an unconventional

608 photoreceptor. Nature 421, 287–290 (2003).

609 23. Fushimi, K. & Narikawa, R. Cyanobacteriochromes: photoreceptors covering the entire

610 UV-to-visible spectrum. Curr. Opin. Struct. Biol. 57, 39–46 (2019).

611 24. Essen, L. O., Mailliet, J. & Hughes, J. The structure of a complete phytochrome sensory

28

612 module in the Pr ground state. Proc. Natl. Acad. Sci. U. S. A. 105, 14709–14714 (2008).

613 25. Fortunato, A. E., Annunziata, R., Jaubert, M., Bouly, J.-P. & Falciatore, A. Dealing with

614 light: the widespread and multitasking cryptochrome/photolyase family in photosynthetic

615 organisms. J. Plant Physiol. 172, 42–54 (2015).

616 26. Johnson, L. K., Alexander, H. & Brown, C. T. Re-assembly, quality evaluation, and

617 annotation of 678 microbial eukaryotic reference transcriptomes. Gigascience 8, (2019).

618 27. Pružinská, A., Tanner, G., Anders, I., Roca, M. & Hörtensteiner, S. Chlorophyll breakdown:

619 Pheophorbide a oxygenase is a Rieske-type iron–sulfur protein, encoded by the accelerated

620 cell death 1 gene. Proc. Natl. Acad. Sci. U. S. A. 100, 15259–15264 (2003).

621 28. Hauenstein, M., Christ, B., Das, A., Aubry, S. & Hörtensteiner, S. A Role for TIC55 as a

622 Hydroxylase of Phyllobilins, the Products of Chlorophyll Breakdown during Plant

623 Senescence. Plant Cell 28, 2510–2527 (2016).

624 29. Schöttler, M. A. & Tóth, S. Z. Photosynthetic complex stoichiometry dynamics in higher

625 plants: environmental acclimation and photosynthetic flux control. Front. Plant Sci. 5, 188

626 (2014).

627 30. Wu, S. et al. NON-YELLOWING2 (NYE2), a Close Paralog of NYE1, Plays a Positive

628 Role in Chlorophyll Degradation in Arabidopsis. Mol. Plant 9, 624–627 (2016).

629 31. Kunugi, M. et al. Evolution of Green Plants Accompanied Changes in Light-Harvesting

630 Systems. Plant Cell Physiol. 57, 1231–1243 (2016).

631 32. Swingley, W. D. et al. Characterization of photosystem I antenna proteins in the

632 prasinophyte Ostreococcus tauri. Biochim. Biophys. Acta 1797, 1458–1464 (2010).

633 33. Jones, P. et al. InterProScan 5: genome-scale protein function classification. Bioinformatics

634 30, 1236–1240 (2014).

29

635 34. Kim, Y. M. et al. dbCRY: a Web-based comparative and evolutionary genomics platform

636 for blue-light receptors. Database 2014, bau037 (2014).

637 35. O’Leary, N. A. et al. Reference sequence (RefSeq) database at NCBI: current status,

638 taxonomic expansion, and functional annotation. Nucleic Acids Res. 44, D733–45 (2016).

639 36. Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinformatics 10, 421

640 (2009).

641 37. Keeling, P. J. et al. The Marine Microbial Eukaryote Transcriptome Sequencing Project

642 (MMETSP): illuminating the functional diversity of eukaryotic life in the oceans through

643 transcriptome sequencing. PLoS Biol. 12, e1001889 (2014).

644 38. Mukougawa, K., Kanamoto, H., Kobayashi, T., Yokota, A. & Kohchi, T. Metabolic

645 engineering to produce phytochromes with phytochromobilin, phycocyanobilin, or

646 phycoerythrobilin chromophore in Escherichia coli. FEBS Lett. 580, 1333–1338 (2006).

647 39. Fushimi, K. et al. Rational conversion of chromophore selectivity of cyanobacteriochromes

648 to accept mammalian intrinsic biliverdin. Proc. Natl. Acad. Sci. U. S. A. 116, 8301–8309

649 (2019).

650 40. Hasegawa, M. et al. Molecular characterization of D. J. Biol. Chem. 293, 1713–1727

651 (2018).

652 41. Fushimi, K. et al. Photoconversion and Fluorescence Properties of a Red/Green-Type

653 Cyanobacteriochrome AM1_C0023g2 That Binds Not Only Phycocyanobilin But Also

654 Biliverdin. Front. Microbiol. 7, 588 (2016).

655 42. Narikawa, R. et al. A biliverdin-binding cyanobacteriochrome from the chlorophyll d-

656 bearing cyanobacterium Acaryochloris marina. Sci. Rep. 5, 7950 (2015).

657 43. Eichenberg, K. et al. Arabidopsis phytochromes C and E have different spectral

30

658 characteristics from those of phytochromes A and B. FEBS Lett. 470, 107–112 (2000).

659 44. Curtis, M. D. & Grossniklaus, U. A gateway cloning vector set for high-throughput

660 functional analysis of genes in planta. Plant Physiol. 133, 462–469 (2003).

661 45. Clough, S. J. & Bent, A. F. Floral dip: a simplified method forAgrobacterium-mediated

662 transformation ofArabidopsis thaliana. The Plant Journal vol. 16 735–743 (1998).

663 46. Earley, K. W. et al. Gateway-compatible vectors for plant functional genomics and

664 proteomics. Plant J. 45, 616–629 (2006).

665 47. Lewis, J. D., Abada, W., Ma, W., Guttman, D. S. & Desveaux, D. The HopZ family of

666 Pseudomonas syringae type III effectors require myristoylation for virulence and avirulence

667 functions in Arabidopsis thaliana. J. Bacteriol. 190, 2880–2891 (2008).

668 48. Suzuki, S., Yamaguchi, H., Nakajima, N. & Kawachi, M. Raphidocelis subcapitata

669 (=Pseudokirchneriella subcapitata) provides an insight into genome evolution and

670 environmental adaptations in the . Sci. Rep. 8, 8058 (2018).

671 49. Zimin, A. V. et al. Hybrid assembly of the large and highly repetitive genome of Aegilops

672 tauschii, a progenitor of bread wheat, with the MaSuRCA mega-reads algorithm. Genome

673 Research vol. 27 787–792 (2017).

674 50. Morisse, P., Marchet, C., Limasset, A., Lecroq, T. & Lefebvre, A. CONSENT: Scalable

675 long read self-correction and assembly polishing with multiple sequence alignment.

676 doi:10.1101/546630.

677 51. Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina

678 sequence data. Bioinformatics 30, 2114–2120 (2014).

679 52. Di Genova, A., Di Genova, A., Ruz, G. A., Sagot, M.-F. & Maass, A. Fast-SG: an

680 alignment-free algorithm for hybrid assembly. GigaScience vol. 7 (2018).

31

681 53. Xu, G.-C. et al. LR_Gapcloser: a tiling path-based gap closer that uses long reads to

682 complete genome assembly. GigaScience vol. 8 (2019).

683 54. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler

684 transform. Bioinformatics 25, 1754–1760 (2009).

685 55. Walker, B. J. et al. Pilon: An Integrated Tool for Comprehensive Microbial Variant

686 Detection and Genome Assembly Improvement. PLoS ONE vol. 9 e112963 (2014).

687 56. Huerta-Cepas, J. et al. Fast Genome-Wide Functional Annotation through Orthology

688 Assignment by eggNOG-Mapper. Mol. Biol. Evol. 34, 2115–2122 (2017).

689 57. Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–

690 3100 (2018).

691 58. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21

692 (2013).

693 59. Anders, S. & Huber, W. Differential expression analysis for sequence count data. Genome

694 Biol. 11, R106 (2010).

695

696

697 Acknowledgements

698 We thank Dr. S. Ota (NIES, Japan) for the FCM analyses. This research was partially supported

699 by the National BioResource Project (NBRP) from Japan Agency for Medical Research and

700 Development (AMED).

701

702 Author contributions

32

703 Y.M. and A.S. identified the DUC1 gene and analyzed its expression. S.Su., H.Y., M.K

704 determined the P. pravasolii genome. K.F., R.N. confirmed the absorbance spectra of PpPHY

705 and PpCRY. S.Shi., M.H., T.K. contributed the RNA analysis and complementation assay in

706 Arabidopsis. H.H. and S. Shi examined the intracellular localization of DUC1. K.Y., T.W., T.G.,

707 T.S. provided the metagenome data and the gDNA samples. Y.M., S.Su., K.F., S.Shi., Y.K.,

708 R.N., H.Y., M.K., M.M. wrote the manuscript.

709

710 Competing interests

711 The authors declare no competing interests.

712

713 ORCID for corresponding author

714 Minami Matsui

715 https://orcid.org/0000-0001-5162-2668

716

717

718 Figure legends

719 Figure 1. Metagenome sampling locations and chimeric photoreceptor, PpDUC1. (A)

720 Domain conservation between PpDUC1 and Arabidopsis CRY2 and PHYB. Domain structure of

721 PpDUC1 constructed with PAS: Per-Arnt-Sim, GAF: cGMP-specific phosphodiesterases,

722 adenylyl cyclases and FhlA, PHY: phytochrome, PHR: photolyase-homologous region, and

723 FAD: flavin adenine dinucleotide. (B) Metagenome sampling locations (red dots). (C) Light

33

724 microscopic image of coccoid cells of P. provasolii NIES-2983. Scale bar = 5 µm. (D)

725 Phylogeny of P. provasolii and comparison of gene content in prasinophytes. Maximum

726 likelihood (ML) tree of 126 orthologous proteins. The dataset was composed of 46,771 amino

727 acids from 16 species that have genomes available, including P. provasolii. Bootstrap

728 percentages (BPs) are shown on the nodes. Bold lines show BP = 100.

729

730 Figure 2. Photoconversion of PpPHY and PpCRY. (A) Domain structure of PpDUC. PpPHY

731 (magenta) and PpCRY (cyan) regions are shown with their molecular sizes and amino acid

732 positions. (B) Absorption spectra of the dark state (15Z–PCB, Po form) and the photoproduct

733 state (15E–PCB, Pfr form) of PpPHY expressed in C41_pKT271 (harboring the PCB synthetic

734 system). The normalized difference spectrum (dark state – photoproduct state) was calculated

735 from these absorption spectra. (C) Absorption spectra of the dark state (FADOX, Pb form) and the

736 photoproduct state (FAD•–, Puv form) of PpCRY expressed in C41_pG-KJE8 (harboring the

737 chaperone expression system). The normalized difference spectrum (dark state – photoproduct

738 state) was calculated from these absorption spectra. The absorption maxima of PpPHY and

739 PpCRY are reported in Table S1. Assignment of chromophores incorporated in PpPHY and

740 PpCRY was performed by comparison with each standard, which are reported in Figure S5.

741

742 Figure 3. Functional complementation of Arabidopsis cry2 and subcellular localization of

743 PpDUC1 in tobacco leaves. (A) Phenotypes of seven-day-old seedlings of wild type (WT, col-

744 4), cry2 (cry2-1), and transgenic lines with PpDUC1-HA overexpressed in the cry2 background,

745 grown in continuous blue light (5.2 μmol m−2 s−1). Scale bar = 1 mm. (B) Mean hypocotyl

34

746 lengths of 7-day-old seedlings ± s.d. (**P<0.01, n=10 to 19). (C) Mean hypocotyl lengths of 7-

747 day-old seedlings grown under dark conditions ± s.d. (n=14 to 21). (D) Schematic diagrams of

748 domains in the PpPHY, PpCRY and PpDUC1 constructs used in the N. benthamiana leaf

749 injection assay. (E) Immunoblot images of protein extract from N. benthamiana leaf tissue. Each

750 protein was detected using an α-GFP antibody. Mock, non-injected leaf. Black arrows denote

751 non-specific bands. Dot indicates the dye front. (F) Leaf injection assay in N. benthamiana

752 pavement cells transiently expressing PpPHY-GFP, PpCRY-GFP and PpDUC1-GFP. DIC,

753 differential interference contrast images (left side); GFP, fluorescence images (middle); and

754 Merge, merged images (right side). Scale bar = 50 μm.

755 Figure 4. Transcriptome analysis under monochromatic blue, orange and far-red light. (A)

756 Number of DEGs under each monochromatic light (blue, orange, far-red) against dark

757 conditions. The threshold is q-value < 0.05 and fold-change (FC) < 2/3 or FC > 1.5. (B) GO

758 enrichment analysis for 388 co-regulated DEGs under blue and orange light. (C) Expression

759 values for the light-harvesting chlorophyll protein complex I (LHCI) and prasinophyte-specific

760 LHC (LHCP).

761 Figure 5. Evolutionary scenario of PHY and pCRY in chlorophytes. The phylogeny of

762 representative chlorophytes and the domain structures of PHY and pCRY are shown. Tree

763 topology is based on previous studies. The NCBI-nr database was searched for PHYs and

764 pCRYs. pCRYs of Dolichomastix, Nephroselmis, and Prasinoderma were found in MMETSP

765 assemblies (MMETSP0033, 0034, and 0806, respectively). In Micromonas, M. pusilla possesses

766 PHY, but the gene is absent in M. commoda.

767

35

768 Figure 6. Phylogenetic trees of prasinophyte-specific light-harvesting chlorophyll proteins.

769 and pheophorbide a oxygenase. (A) Unlooted ML tree of prasinophyte-specific light-

770 harvesting chlorophyll proteins (Lhcp). The dataset was composed of 183 amino acids of 56

771 proteins, which were encoded in the prasinophyte genomes that were available in the

772 PhycoCosm database (https://phycocosm.jgi.doe.gov/phycocosm/home). (B) Unlooted ML tree

773 of pheophorbide a oxygenase (Pao). The dataset was composed of 211 amino acids of 34

774 proteins, which were found using an NCBI-nr database search. Bootstrap support values are

775 shown on each node. BP < 50 are not shown. Green color indicates P. provasolii proteins.

776

36

777 Table 1. Characteristics of genomes of 16 Viridiplantae species.

No. Chr/ Genome/ Assembly size Number of Species Strain Assemblies (Mb) genes

Prasinophytes

Pycnococcus provasolii NIES-2893 43 22.7 11,337

Micromonas commoda RCC299 17 21 10,163

Micromonas pusilla CCMP1545 21 22 10,707

Ostreococcus lucimarinus CCE9901 21 13.2 7,839

Ostreococcus tauri RCC4221 20 12.5 7,942

Bathycoccus prasinos RCC1105 19 15 7,847

Chloropicon primus CCMP1205 20 17.4 8,693

UTC clade

Chlamydomoans CC-503 17 111.1 18,115 reinhardtii

Volvox carteri Eve 19 131.2 15,194

Chlorella variabilis NC64A 12 46.2 9,833

Coccomyxa subelipsoidea C-169 20 48.8 9,947

Ulva mutabilis n/a 98.5 12,924

Streptophyta

Mesostigma viride NIES-296 5 441.8 24,431

Chlorokybus CCAC0220 n/a 85 9,300 atmophyticus

Klebsormidium nitens NIES-2285 n/a 103.9 16,244

Arabidopsis thaliana 5 135 27,416

778 n/a: not applicable

779

37

780 Figure 1

781 782 38

783 Figure 2

A

PAS GAF PHY PAS PAS PHR FAD

PpPHY PpCRY 70.2 kDa, 1 – 662 residues 72.2 kDa, 1039 – 1690 residues PpDUC1 181.4 kDa, 1 – 1690 residues

B c

0.63 0.63 expressed in C41_pKT271 – Dark expressed in C41_pG-KJE8 – Dark 0.54 – Photo 0.54 FAD•– – Photo 0.45 0.45 FADOX 15Z–PCB 0.36 15E–PCB 0.36

ABS 0.27 ABS 0.27 0.18 0.18 0.09 0.09 0.00 0.00

1.2 – Dark – Photo 1.2 0.8 0.8 0.4 0.4 ABS ABS D 0.0 D 0.0 -0.4 -0.4 Norm. Norm. -0.8 -0.8 -1.2 -1.2 300 400 500 600 700 800 300 400 500 600 700 Wavelength (nm) Wavelength (nm)

PpPHY PpCRY

784 785

39

786 Figure 3

A B C 12 20 10 15 10 5 2 0 0 Hpocotl lengt () Hpocotl lengt () WT cry2 WT cry2 WT cry2 PpDUC1-HA-OX/cry2 PpDUC1-HA-OX/cry2 PpDUC1-HA-OX/cry2

D F DIC GFP Merge PpPHY-GFP (PpPHY=1-3621 bp; 1207 aa) PASGAF PH PAS PAS PHR GFP 35Spro::PpPHY-GFP PpCY-GFP (PpCRY=3115-5070 bp; 652 aa) PHR FAD GFP

PpC-GFP (PpDUC=1-5070 bp; 1690 aa) 35Spro::PpCRY-GFP PAS GAF PH PAS PAS PHR FAD GFP

E FP trol 35Spro::PpDC-GFP FP GFP -G G - Con tor PpPHY-PpCRYPpDUC1MockGFPVec 250 kDa Mock(N.C.) 150 kDa

100 kDa 5 kDa 35Spro::GFP (P.C.)

50 kDa

Vector Control

787 788

40

789 Figure 4

A lue Orange 03000 Dark Blue 11320 Orae 5 7 24130 are 24140 311280 9 31120 49 02030 2130 005 034300 ar-red 5 034310 04460 B 06160 potosyntesisligtarvesting in potosystem (GO:00097) 101310 potosyntesis (GO:005979) 1260 potosyntesisligtreaction (GO:0094) 2040 potosyntesisligtarvesting (GO:000975) 20460

cloropyll biosyntetic process 2040 (GO:005995) porpyrin-containing compound 2040 biosyntetic process(GO:000779) 2000 response to far red ligt (GO:000) tetrapyrrole biosyntetic process 2010 (GO:0004) 21310 response to lo ligt intensity stimulus (GO:000945) 21320 generation of precursor metabolites and energy (GO:00009) 21330 regulation of cellular process (GO:0050794) 21340 0 2 4 6 8 0 10000 20000 30000 Normalized read counts 790 -log(p-value) 791

41

792 Figure 5

P Chlamydomonas C Gain Lss Tetraselmis P C P Chloroion C C C Pycnococcus C P P ehroselmis C

P iromonas P C C streoos P C athyos

P olihomasti C CYC

P Prasinoderma C

raidosis P C Cn ans iiiana P PAS GAF PHY His REC C PHR 793 FAD 794

42

795 Figure 6

PP101310 A PP290470 PP290490 PP291320 100 PP291310 PP290460 Pycnococcus provasolii PP290510 100 PP290450 100 PP291340 PP290500 PP291330 PP061670 64 _Bp1_7102 B prasinos 100 _sCC809_2_25131 Ostreococcus sp. CC809 73 89 _MpN32_1873 M pusilla CCMP1545 _MpC32_7636 M commoda CC299 PP152670 PP034310 54 P provasolii 100 PP044760 PP034300 99 _sCC809_2_54691 _sCC809_2_92478 Ostreococcus sp. CC809 _sCC809_2_95606 74 77 _s9901_3_27706 _s9901_3_30781 Ostreococcus lucimarinus CCE9901 82_s9901_3_50137 _s9901_3_30780 97 _s4221_3_71726 _s4221_3_73618 83 _s4221_3_73619 Ostreococcus tauri CC4221 61 _s4221_3_75570 82_MpC32_343 _MpC32_344 72_MpC32_6573 _MpC32_6574 Micromonas commoda CC299 _MpC32_9352 _MpN32_1357 _MpN32_1358 _MpN32_5439 _MpN32_1356 Micromonas pusilla CCMP1545 _MpN32_4278 63 _MpN32_4279 _MpN32_5438 56 _Bp1_3898 51 _Bp1_6646 8265 _Bp1_6903 Batycoccus prasinos _Bp1_6648 94 _Bp1_6647 59 _Bp1_290 B prasinos _MpN32_549 M pusilla CCMP1545 72 81 _MpC32_4774 M commoda CC299 _sCC809_2_89455 Ostreococcus sp. CC809 _s9901_3_29120 O lucimarinus CCE9901 82 _s4221_3_69519 O tauri CC4221 _MpN32_1322 M pusilla CCMP1545 _MpC32_6612 M commoda CC299

0.2

100 PNW82359.1 Chlamydomonas reinhardtii B PN06499.1 Tetrabaena socialis WP_026733903.1 ischerella sp. PCC 9605 BAY09420.1_Calothrix_sp._NIES-2098 100 WP_106546899.1 Chroococcidiopsis sp. CCAA_051 58566.1_Mooreabouillonii_PN 85 AU81130.1 Cylindrospermum licheniforme UEX B_2014 C CP 64 WP_012163293.1 caryochloris marina PP090660 Pycnococcus provasolii 100 XP_022234917.1 Bathycoccus prasinos 57 93 XP_003081710. Ostreococcus tauri XP_001420009.1 Ostreococcus lucimarinus CCE9901 XP_003055098.1 Micromonas pusilla CCMP1545 XP_002503849.1 Micromonas commoda Psps PZV24708.1 Snowella sp. 87 WP_155749972.1 Scytonema sp. UIC 10036 60 BAY65740.1 Calothrix brevissima NIES-22 WP_067770853.1 Nostoc sp. NIES-3756 65 WP_124155484.1 Okeania hirsuta 95 VP64014.1 Nodularia sp.

QFS52472.1 Nostoc sphaeroides CCNUC1 C 80 89 WP_023064779.1 Lynbya aestuarii WP_046281112.1 Limnoraphis robusta PP320220 Pycnococcus provasolii 84 XP_024380088.1 Physcomitrium patens PN37730.1 Trema orientale 100 70 PQQ14092.1 Prunus yedoensis . nudiflora 87 58 100 XP_003538163.1 Glycine max 100 XP_008812326.1 Phoenix dactylifera pp s Spps AQ92150.1 lebsormidium nitens

100 XP_001697234.1 Chlamydomonas reinhardtii 54 XP_001697235.1 Chlamydomonas reinhardtii BF88927.1 aphidocelis subcapitata

CEM11347.1 itrella brassicaformis_CCMP3155 Cps

0.3 796

43

Figures

Figure 1

Metagenome sampling locations and chimeric photoreceptor, PpDUC1. (A) Domain conservation between PpDUC1 and Arabidopsis CRY2 and PHYB. Domain structure of PpDUC1 constructed with PAS: Per-Arnt- Sim, GAF: cGMP-specic phosphodiesterases, adenylyl cyclases and FhlA, PHY: phytochrome, PHR: photolyase-homologous region, and FAD: avin adenine dinucleotide. (B) Metagenome sampling locations (red dots). (C) Light microscopic image of coccoid cells of P. provasolii 724 NIES-2983. Scale bar = 5 μm. (D) Phylogeny of P. provasolii and comparison of gene content in prasinophytes. Maximum likelihood (ML) tree of 126 orthologous proteins. The dataset was composed of 46,771 amino acids from 16 species that have genomes available, including P. provasolii. Bootstrap percentages (BPs) are shown on the nodes. Bold lines show BP = 100.

Figure 2

Photoconversion of PpPHY and PpCRY. (A) Domain structure of PpDUC. PpPHY (magenta) and PpCRY (cyan) regions are shown with their molecular sizes and amino acid positions. (B) Absorption spectra of the dark state (15Z–PCB, Po form) and the photoproduct state (15E–PCB, Pfr form) of PpPHY expressed in C41_pKT271 (harboring the PCB synthetic system). The normalized difference spectrum (dark state – photoproduct state) was calculated from these absorption spectra. (C) Absorption spectra of the dark state (FADOX735 , Pb form) and the photoproduct state (FAD•–, Puv form) of PpCRY expressed in C41_pG-KJE8 (harboring the chaperone expression system). The normalized difference spectrum (dark state – photoproduct state) was calculated from these absorption spectra. The absorption maxima of PpPHY and PpCRY are reported in Table S1. Assignment of chromophores incorporated in PpPHY and PpCRY was performed by comparison with each standard, which are reported in Figure S5.

Figure 3

Functional complementation of Arabidopsis cry2 and subcellular localization of PpDUC1 in tobacco leaves. (A) Phenotypes of seven-day-old seedlings of wild type (WT, col- 4), cry2 (cry2-1), and transgenic lines with PpDUC1-HA overexpressed in the cry2 background, grown in continuous blue light (5.2 μmol m−2 s−1). Scale bar = 1 mm. (B) Mean hypocotyl lengths of 7-day-old seedlings ± s.d. (**P<0.01, n=10 to 19). (C) Mean hypocotyl lengths of 7- day-old seedlings grown under dark conditions ± s.d. (n=14 to 21). (D) Schematic diagrams of domains in the PpPHY, PpCRY and PpDUC1 constructs used in the N. benthamiana leaf injection assay. (E) Immunoblot images of protein extract from N. benthamiana leaf tissue. Each protein was detected using an α-GFP antibody. Mock, non-injected leaf. Black arrows denote non-specic bands. Dot indicates the dye front. (F) Leaf injection assay in N. benthamiana pavement cells transiently expressing PpPHY-GFP, PpCRY-GFP and PpDUC1-GFP. DIC, differential interference contrast images (left side); GFP, uorescence images (middle); and Merge, merged images (right side). Scale bar = 50 μm.

Figure 4 Transcriptome analysis under monochromatic blue, orange and far-red light. (A) Number of DEGs under each monochromatic light (blue, orange, far-red) against dark conditions. The threshold is q-value < 0.05 and fold-change (FC) < 2/3 or FC > 1.5. (B) GO enrichment analysis for 388 co-regulated DEGs under blue and orange light. (C) Expression values for the light-harvesting chlorophyll protein complex I (LHCI) and prasinophyte-specic LHC (LHCP).

Figure 5

Evolutionary scenario of PHY and pCRY in chlorophytes. The phylogeny of representative chlorophytes and the domain structures of PHY and pCRY are shown. Tree topology is based on previous studies. The NCBI-nr database was searched for PHYs and pCRYs. pCRYs of Dolichomastix, Nephroselmis, and Prasinoderma were found in MMETSP assemblies (MMETSP0033, 0034, and 0806, respectively). In Micromonas, M. pusilla possesses PHY, but the gene is absent in M. commoda. Figure 6

Phylogenetic trees of prasinophyte-specic light-harvesting c hlorophyll proteins. and pheophorbide a oxygenase. (A) Unlooted ML tree of prasinophyte-specic light harvesting chlorophyll proteins (Lhcp). The dataset was composed of 183 amino acids of 56 proteins, which were encoded in the prasinophyte genomes that were available in the PhycoCosm database (https://phycocosm.jgi.doe.gov/phycocosm/home). (B) Unlooted ML tree of pheophorbide a oxygenase (Pao). The dataset was composed of 211 amino acids of 34 proteins, which were found using an NCBI-nr database search. Bootstrap support values are shown on each node. BP < 50 are not shown. Green color indicates P. provasolii proteins.

Supplementary Files

This is a list of supplementary les associated with this preprint. Click to download.

SupplementaryInformation.pdf SupplementaryData.xlsx