DNA Research 11, 69–81 (2004)

Molecular Evolution of PAS Domain-Containing Proteins of Filamentous Cyanobacteria Through Domain Shuffling and Domain Duplication

Rei Narikawa, Shinobu Okamoto, Masahiko Ikeuchi,∗ and Masayuki Ohmori

Department of Life Sciences (Biology), Graduate School of Arts and Sciences, University of Tokyo, Komaba, Meguro-ku, Tokyo 153-8902, Japan

(Received 15 December 2003; revised 11 March 2004) Downloaded from https://academic.oup.com/dnaresearch/article/11/2/69/534432 by guest on 28 September 2021

Abstract

When the entire genome of a filamentous heterocyst-forming N2-fixing cyanobacterium, Anabaena sp. PCC 7120 (Anabaena), was determined in 2001, a large number of PAS domains were detected in signal-transducing proteins. The draft genome sequence is also available for the cyanobacterium Nostoc punctiforme strain ATCC 29133 (Nostoc), which is closely related to Anabaena. In this study, we extracted all PAS domains from the Nostoc genome sequence and analyzed them together with those of Anabaena. Clustering analysis of all the PAS domains gave many specific pairings, indicative of evolutionary conservation. Ortholog analysis of PAS-containing proteins showed that, in some cases, the multidomain architecture is a composite of conserved domains and domains of disagreement between the two species. Further inspection of the domains of disagreement allowed us to trace them back in evolution. Thus, multidomain proteins could have been generated by duplication or shuffling in these cyanobacteria. The conserved PAS domains in the orthologous proteins were analyzed by structural fitting to the known PAS domains. We detected several subclasses with unique sequence features, which will be the target of experimental analysis.

Key words: cyanobacterium; PAS domain; domain shuffling; ortholog pair; molecular evolution

Communicated by Satoshi Tabata
∗ To whom correspondence should be addressed. Tel. +81-3-5454-6641, Fax. +81-3-5454-4337, E-mail: [email protected] tokyo.ac.jp

1. Introduction

Signal-transducing proteins usually have sensor modules and response modules.1 In bacteria, PAS domains,2 GAF domains,3 CBS domains,4 HAMP domains5 and FHA domains6 are mainly employed as the sensor modules, while histidine kinase (HK) domains,7 HPt domains,7 response regulator (RR) domains,7 adenylate cyclase domains,8 GGDEF domains,1 EAL domains,1 Ser/Thr kinase domains9 and DNA-binding domains are mainly used as the response modules.

The PAS domain is one of the important signaling modules that monitor changes in light, redox potential, oxygen, small ligands, or the overall energy level of a cell.10 It is widely distributed from bacteria to higher plants and animals. The PAS domain superfamily is highly diverse in sequence and in length but is conserved in the basic three-dimensional (3D) structure, consisting of the PAS-core motif, the helical connector and the PAC motif. So far, five distinct structures have been resolved: the coumaric acid-binding blue light sensor PAS domain,11 the heme-binding O2 sensor PAS domain,12,13 the voltage sensor PAS domain,14 the FMN-binding blue light sensor PAS domain,15 and the small compound-binding PAS domain.16 However, there is still a huge number of PAS domains whose structure or function remains to be solved.

Kaneko et al. reported the complete sequence of the entire genome of a filamentous heterocyst-forming N2-fixing cyanobacterium, Anabaena sp. strain PCC 7120 (Anabaena).17 Later, domain analysis revealed that the Anabaena genome encodes a number of signal-transducing proteins including HK and RR proteins of the two-component regulatory system, PAS domain proteins, GAF domain proteins, and Ser/Thr-type protein kinase proteins.18 It was found that not only the number of such proteins but also the number of signaling domains in single proteins is extremely large compared with other bacteria. In the total genome, 87 GAF domains were detected in 62 putative proteins, while 140 PAS domains were detected in 59 proteins. Moreover, many signaling proteins in Anabaena were characterized as multidomain proteins. For example, one gene (all2095) was predicted to encode ten PAS domains in tandem arrangement in addition to an HK domain. Another example, all0729, was predicted to encode three PAS domains, three GAF domains, one HK domain and two RR domains. To understand the complexity, we performed clustering analysis of the GAF and PAS domains within the Anabaena genome. It was found that only a few of them could be grouped together, while most of the others were placed as single-component clades. The PAS domains in particular seem to be highly diverged according to the clustering analysis within the Anabaena genome. This suggests that, in most cases, even the tandemly arranged PAS domains were not generated by simple duplication. To study the evolutionary and physiological role of the multiple PAS domains, we must take into consideration the genome information of closely related cyanobacterial species.

Recently, genome projects have been finished or are in progress for more than ten cyanobacterial species. Of these, Nostoc punctiforme strain ATCC 29133 (Nostoc) is our choice for computational analysis in comparison with Anabaena. It is also a filamentous heterocyst-forming N2-fixing cyanobacterium. Nostoc has the additional properties of symbiosis with plants and development of akinetes and hormogonia. Phylogenetically, it is closely related to Anabaena according to sequence analysis of rRNA and many protein-coding genes, although significant differences in gene composition and cellular function such as symbiosis have been documented.19 In this study, we extracted all PAS domains from the draft genome sequence of Nostoc and analyzed them together with those of Anabaena. Detailed comparison between Nostoc and Anabaena gave us deeper insights into the evolution of signaling domains and proteins in cyanobacteria. Based on these analyses, we identified orthologous pairs of domains or genes as well as unique ones, which might have been generated by duplication or shuffling of domains, sets of domains, or genes. We further analyzed the orthologous PAS domains by structural fitting to the known domains.

2. Materials and Methods

2.1. Sequences

The whole set of potential proteins deduced from the complete genome of Anabaena was obtained from CyanoBase (http://www.kazusa.or.jp/cyano/cyano.html). A similar set of proteins from the draft sequence of Nostoc was obtained from NCBI (updated on 07-NOV-2002). Sequences from other species were also obtained from NCBI. The draft Nostoc sequence may contain some sequence errors and assembly errors. However, we detected only three frameshift mutations in the PAS domain-containing proteins of Nostoc. One of them, Npun3208-3209, is not likely due to a sequencing error because the mutation was a tandem duplication of 11 bp (see Results and Discussion, 3.4.2). The others (Npun5710-5709 and Npun2261-2260) may be caused by sequencing errors or may simply be pseudogenes. Regardless, they are categorized as Nostoc-specific proteins and had little effect on the result of the analysis. Assembly errors in the draft sequence may result in the generation of hybrid ORFs. However, detailed comparison of the PAS domain-containing proteins did not show such an example except for Npun0339 (see Results and Discussion), which is, again, specific to Nostoc. In conclusion, the draft Nostoc sequence seems to be sufficient for the bioinformatic analysis presented here.

2.2. Informatics

Homology analysis was performed using the BLAST (NCBI-BLAST, version 2.1–2.2) and PSI-BLAST (version 2.1.1) programs20 running locally or on the Web (non-redundant GenBank, SwissProt in NCBI and GenomeNet and Cyanobase). Motif analysis was performed by Pfam21 and SMART22 searches locally or on the Web (http://www.sanger.ac.uk/Pfam/, http://smart.embl-heidelberg.de/). Multiple alignments were performed using CLUSTAL X.23 Phylogenetic trees were constructed using TreeView.24 Harplot analysis was performed with GENETYX®-MAC Ver. 10.1 (SOFTWARE DEVELOPMENT CO., LTD.). Secondary structure was predicted with PSIPRED.25

2.3. Detection of PAS domains

We previously detected 140 PAS domains from the Anabaena genome18 and constructed a sequence alignment. Based on this alignment, we built a custom profile HMM and extracted the PAS domains from Anabaena and Nostoc using the default search parameters of HMMER (version 2.2)26 with the cut off E-value of 10. The custom profile HMM and the extracted PAS sequences are available at the web site http://bio.c.u-tokyo.ac.jp/labs/ikeuchi/narikawa-resource. Moreover, we considered a PAS domain to be a false-positive and manually removed it whenever a significant part of the domain was also assigned to other known motifs having higher E-values. PAS domains from the genomes of 98 other prokaryote species were automatically detected when the E-value was over 0.1.

2.4. Clustering analysis

We performed bootstrap trials based on the multiple alignment of PAS domains from Anabaena and Nostoc. We regarded a group of PAS domains as a distinct subclass when it clustered over the cutoff value (>500) of the bootstrap trials (1000).

2.5. Definition of ortholog

A specific definition of the ortholog is needed for multidomain signaling proteins. Since there is large variation in the conservation and composition of domains of such proteins, a BLAST search of whole proteins often gives confusing results. Here, we defined the ortholog of the multidomain signaling proteins as follows. First, we divided the domains in a single protein into the sensor modules (PAS, GAF, CBS, HAMP and FHA domains) and the response modules (HK, HPt, RR, adenylate cyclase, GGDEF, EAL, Ser/Thr kinase and DNA-binding domains). Second, we detected the orthologous domains that show a reciprocal best hit between the two species in a BLAST search, and selected as orthologs the proteins that carry orthologous domains for both the sensor modules and the response modules. In this communication, we focused on the orthologs that carry PAS domains as the orthologous domain (Fig. 1).

3. Results and Discussion

3.1. PAS-containing proteins

By the HMM search with the custom-made profile derived from the 140 PAS domains of Anabaena, we detected 180 PAS domains in 84 ORFs in the draft sequence of Nostoc and 143 PAS domains in 61 ORFs in Anabaena (Table 1). Three more PAS domains were found in Anabaena in the present study than in our previous study.18 We newly identified PAS domains of subclass #40 in All3447, subclass #47 in Alr3170 and subclass #64 in All4896 (Fig. 1A). These three PAS domains have counterparts in Nostoc. Although they were not clearly picked up by a conventional Pfam search, we assumed here that they belong to new subclasses of the PAS superfamily. The total number of PAS domains in Anabaena and Nostoc is extremely large in comparison with other bacteria and archaea. For example, Escherichia coli has 11 PAS domains in 9 ORFs, Bacillus subtilis has 12 PAS domains in 10 ORFs, Archaeoglobus fulgidus has 32 PAS domains in 17 ORFs and Synechocystis sp. PCC 6803 (Synechocystis) has 47 PAS domains in 17 ORFs.10

3.2. Clustering analysis of PAS domains

We clustered the 323 PAS domains of Anabaena and Nostoc, requiring support in over 50% of the bootstrap trials. Sixty-four clusters were detected, comprising 288 PAS domains, and 35 PAS domains remained as orphans (Table 1). In this paper, we tentatively defined each of these 64 clusters as a distinct subclass, to which we assigned an ID number (#1–64) (see Figs. 1 and 2). The orphan PAS domains might have diverged too much during evolution to be categorized. On the other hand, when only Anabaena PAS domains were analyzed, 25 clusters comprising 96 PAS domains and 47 orphan PAS domains were detected (Table 1, see also Ohmori et al. 2001). Similarly, we found 48 orphan PAS domains in Nostoc when the analysis was restricted to this species. By contrast, only 35 PAS domains remained orphans when Anabaena and Nostoc were analyzed together. In other words, of the 95 PAS domains (47 plus 48) that were orphans in the single-species analyses, an additional 60 could be assigned to clusters. In total, 288 PAS domains were clustered into 64 subclasses, which might have kept common roles, if any, during evolution. Of these, subclasses #10 and #27 are putative flavin-binding and heme-binding subclasses, respectively, which possibly serve as a blue light sensor and an oxygen sensor. Besides these, we could not detect any subclasses with specific conservation to those of well-known function.

3.3. Ortholog analysis of PAS-containing proteins

As we previously noted,18 PAS domains, often present in multiple copies, are mostly found together with one or more domains of another category of signal transduction in a single protein, leading to a predominance of unique multidomain proteins in Anabaena and Nostoc. To extend our clustering analysis of the PAS domains, we compared, by BLAST search, the whole domain architecture of proteins containing those pairs of PAS domains between the two species. First, we detected 22 pairs of orthologs that highly resemble each other throughout the entire polypeptide (Fig. 1A). Second, we detected 14 nearly orthologous pairs of Anabaena and Nostoc proteins that highly resemble each other in most but not all domains (Fig. 1B). To help the domain comparison, we connected the homologous domains between the pairs with solid lines in Fig. 1B. In every case, one can see one or more extra domains in either the Anabaena or the Nostoc counterpart. Hereafter, we call such extra domains “domains of disagreement.” As an extreme example, All0729 has one extra PAS domain of subclass #5, while the Nostoc counterpart Npun6145 has three extra PAS domains (subclasses #2, #31, and #45) in addition to two common PAS domains (subclasses #1 and #19) as well as other domains (Fig. 1B). This suggests that they have undergone insertion or deletion many times during evolution after the branching of the two species, while keeping their basic domain architecture as a nearly orthologous protein. Although we cannot distinguish between insertions and deletions with confidence, we must mention that both Anabaena and Nostoc have an extremely large number of PAS domains compared with other organisms. This means that PAS domains have a strong tendency to increase in number by duplication, at least in these species. Here, we provisionally assumed that the domains of disagreement have mostly resulted from insertion due to domain duplication rather than deletion.

From Fig. 1B, we can identify domains of disagreement in the PAS-containing orthologous proteins as follows: 22 PAS domains, 5 GAF domains and 5 RR domains. These PAS and GAF domains of disagreement are all found next to other PAS or GAF domains, while RR domains of disagreement are present either at the N-terminus (Npun6157) or at the very C-terminus next to the HK domain (Npun1855, Alr2428 and Npun2227). There is no disagreement in domains of HK or other signal output domains (GGDEF, EAL, adenylate cyclase and DNA-binding domains).

[Figure 1 diagram not reproduced: paired domain-architecture drawings for each Anabaena/Nostoc ortholog pair, in panels (A) and (B). Legend: CBS, HAMP, FHA, PAS (numbered with subclass ID, #1–64), orphan PAS, GAF, histidine kinase, response regulator, HPt, GGDEF, EAL, adenylate cyclase, OmpR, LuxR and CRP DNA-binding, and transmembrane domains.]

Figure 1. Domain architecture alignment between the orthologous PAS proteins of Anabaena and Nostoc. A: complete orthologous pairs without domains of disagreement. B: orthologous pairs containing domains of disagreement. Proteins are shown from N-terminus to C-terminus, left to right. Domains are depicted mostly in proportion to the number of amino acid residues. PAS domains of the same subclass have the same color and number. “Orthologous domains” between Anabaena and Nostoc are connected by lines in panel B.

[Figure 2 diagram not reproduced: domain-architecture drawings of species-specific PAS proteins, in two groups headed Anabaena and Nostoc. Legend: CBS, HAMP, HPt, PAS (numbered with subclass ID, #1–64), orphan PAS, GAF, GGDEF, EAL, histidine kinase, response regulator, methylesterase, methyltransferase, Ser/Thr protein kinase, AraC DNA-binding, and transmembrane domains.]

Figure 2. Domain architecture of orphan PAS proteins of Anabaena and Nostoc.

Thus, it is apparent that insertion of signaling domains has been strongly biased with respect to both the category of domain and its location in signaling proteins, even though insertion must have taken place at the DNA level. At the moment, it is not clear whether or not those domains still retain their original function.

Of the 36 orthologous pairs of PAS-containing proteins of Anabaena and Nostoc (Fig. 1), only 4 pairs (Alr3511/Npun6122, All4502/Npun2599, Alr3170/Npun5680 and All2875/Npun0349) also have orthologs in the unicellular N2-nonfixing cyanobacterium Synechocystis (Sll0698, Sll0337, Slr0359 and Slr0759, respectively). These conserved pairs therefore appear to share a common function widely among cyanobacteria. For example, Alr3511 and Npun6122 are homologs of the global stress sensor in Synechocystis (Sll0698 or Hik33)27 and in Synechococcus sp. PCC 7942 (NblS).28 They have only one PAS domain, of subclass #41, as a possible sensory domain. All4502 and Npun2599 are homologs of the phosphate sensor, which is involved in the induction of alkaline phosphatase under phosphate-limiting conditions in Synechocystis (Sll0337, Hik7 or PhoR).29 They have only one PAS domain, of subclass #43, as a possible phosphate-sensing domain.
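The ortholog assignments discussed here rest on the reciprocal-best-hit criterion of Section 2.5. The following is a minimal sketch of that criterion under stated assumptions, not the authors' pipeline: the protein names (A1, A2, B1, B2), the `protein:domain` labels and all scores are hypothetical, a single score per domain pair stands in for a BLAST result, and every domain type outside the sensor list is treated as a response module.

```python
SENSOR = {"PAS", "GAF", "CBS", "HAMP", "FHA"}   # sensor modules named in Section 2.5

def best_hits(scores):
    """Return {query: best-scoring subject} from {(query, subject): score}."""
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Domain pairs that are each other's best hit in both search directions."""
    fwd, rev = best_hits(a_vs_b), best_hits(b_vs_a)
    return sorted((a, b) for a, b in fwd.items() if rev.get(b) == a)

def orthologous_proteins(domain_pairs):
    """Protein pairs sharing reciprocal-best-hit domains in both module classes."""
    modules = {}
    for dom_a, dom_b in domain_pairs:
        prot_a, kind = dom_a.split(":")
        prot_b = dom_b.split(":")[0]
        module = "sensor" if kind in SENSOR else "response"
        modules.setdefault((prot_a, prot_b), set()).add(module)
    return sorted(p for p, m in modules.items() if m == {"sensor", "response"})

# Hypothetical scores for two proteins per species (illustration only)
a_vs_b = {("A1:PAS", "B1:PAS"): 200, ("A1:PAS", "B2:PAS"): 90,
          ("A1:HK", "B1:HK"): 300, ("A1:HK", "B2:HK"): 150,
          ("A2:PAS", "B2:PAS"): 120, ("A2:HK", "B1:HK"): 160}
b_vs_a = {("B1:PAS", "A1:PAS"): 200, ("B1:HK", "A1:HK"): 300,
          ("B1:HK", "A2:HK"): 160, ("B2:PAS", "A2:PAS"): 120,
          ("B2:PAS", "A1:PAS"): 90, ("B2:HK", "A1:HK"): 150}

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(orthologous_proteins(pairs))   # [('A1', 'B1')]
```

In this toy data, A2 and B2 share a reciprocal best hit only for their PAS domains, so they are not called orthologs; this mirrors the requirement that both a sensor module and a response module be conserved.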

Table 1. Clustered and orphan PAS domains of Anabaena and Nostoc.

            Genome size   PAS        PAS       Clustered PAS        Orphan PAS
            (Mb)          proteins   domains   domains (clusters)   domains
Anabaena    6.4*          61         143       96 (25)              47
Nostoc      9.25-9.5#     84         180       132 (33)             48
Combined                  145        323       288 (64)             35

*: Kaneko et al. 2001, #: Meeks et al. 2001. Clustered and orphan PAS domains were defined with regard to the cutoff value (500) for bootstrap trials (1000). Each cluster is referred to as a “subclass” of the PAS superfamily.

These PAS domains of subclass #41 and #43 show extremely high sequence identity of 90% and 87%, respectively, between Anabaena and Nostoc, indicative of their essential roles. Both Alr3170/Npun5680 and All2875/Npun0349 contain PAS domains of subclass #10, which possibly binds flavin. These may serve as a blue light sensor widely among cyanobacteria. The other 32 pairs, whose counterparts are not present in Synechocystis, may be involved in N2-fixation or development of filamentous cells. On the other hand, we could not find any Nostoc counterparts for 25 Anabaena PAS proteins or any Anabaena counterparts for 48 Nostoc PAS proteins (Fig. 2). They may have specific roles unique to each species.

3.4. Domain tracing of orthologous proteins

3.4.1. Duplication of single PAS domains

As mentioned above, domains of disagreement can be ascribed to propagation of single PAS domains by amplification and rearrangement of DNA fragments in genomes. According to the clustering tree of PAS domains of Anabaena and Nostoc, one can trace them back to possible origins. One of the unambiguous examples of such events is the tandem arrangement of closely related PAS domains. Typical examples are found in three proteins. All0824/Npun1852 and All2379 carry tandem repeats of PAS domains of subclass #2 and #21, respectively (Fig. 3A and Fig. 4A). Clearly, the two PAS domains of each repeat are most homologous to each other within the whole genome, although those subclasses include far more than these two members. For example, the two PAS#2 domains of All0824 show sequence identity of 63% at the amino acid level (Fig. 3B), while the second closest one (a single PAS domain of All0182, see Fig. 1A) shows only 38–39% identity. This is also true for the two PAS#2 domains of the Nostoc counterpart, Npun1852. In both genes, a direct tandem repeat of 381 bp covering 127 amino acid residues was recognized by harplot analysis (not shown). This clearly shows that the PAS#2 domains were amplified by tandem duplication at the DNA level in a common ancestor of All0824 and Npun1852 before the branching of Anabaena and Nostoc. Npun1852 has an additional PAS domain (subclass #3), which is absent from All0824 (Fig. 3A); it is assumed to have jumped into its current position after the branching of the two species (Fig. 3C).

On the other hand, All2379 has two PAS domains of subclass #21, while its orthologous counterpart, Npun6860, has one PAS domain of this subclass (Fig. 4A). The two PAS#21 domains of All2379 show 57% identity to each other at the amino acid level, while no other PAS domains in the Anabaena genome are more homologous to them. Harplot and sequence alignment suggested the direct duplication of a 384-bp segment, which harbors the entire PAS domain of 128 amino acid residues (not shown). Since such a repeat is not present in Npun6860, duplication is assumed to have taken place after the branching of Anabaena and Nostoc. Notably, the two PAS domains of All2379 have not diverged equally from the homologous PAS domain of Npun6860. Namely, PAS#21 of Npun6860 shows 53% identity at the amino acid level with the first PAS#21 of All2379 and 82% with the second PAS#21 (Fig. 4B). Probably, after tandem duplication, the first PAS#21 domain in All2379 diverged quickly in evolution, while the second PAS#21 was conserved. We also calculated the sequence conservation of other PAS domains of All2379 with respect to those of Npun6860: 80% for PAS#34 and 74% for PAS#3 (Fig. 4B). These data imply that the second PAS#21 (and possibly PAS#34) plays a much more important role than the other two PAS domains, although their precise role has not yet been elucidated. Further, All2379 and Npun6860 harbor other PAS domains of different subclasses (PAS#1 in All2379 and PAS#8 in Npun6860), which were gained in evolution after the branching of Anabaena and Nostoc (Fig. 4C). Based on the clustering of PAS domains, we assumed that PAS#1 of All2379 was derived by duplication of the 6th PAS domain of Alr1229 (Fig. 2). We also hypothesize that PAS#8 of Npun6860 could be derived from the second PAS domain of Npun1360 (Fig. 1B) or the 5th PAS domain of Npun0900 (Fig. 2). Thus, most of the domains of disagreement in all the pairs of orthologous proteins could be explained as independent insertions of single PAS domains.
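The percent identities quoted in this section compare pairs of aligned PAS domains. As a rough sketch of how such a value can be computed from two rows of an alignment, the function below counts identical residues over the columns in which neither sequence has a gap. The text does not state how identity was normalized, so that choice is an assumption, and the two short sequences are hypothetical.

```python
def percent_identity(aln_a, aln_b):
    """Percent identical residues over alignment columns without gaps.

    Assumption: gap-containing columns are excluded from the denominator.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aln_a, aln_b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)

# Hypothetical aligned fragments (not real PAS sequences)
print(round(percent_identity("MKT-LLE", "MRTALLD"), 1))   # 66.7

# The repeat lengths quoted in the text are whole numbers of codons
for bp, residues in [(381, 127), (384, 128)]:
    assert bp == 3 * residues
```

The final check makes explicit that both tandem repeats described above (381 bp and 384 bp) are multiples of three, so duplicating them adds whole codons and keeps the reading frame intact.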

[Figures 3 and 4 diagrams not reproduced: in each figure, panel (A) shows the paired domain architectures, panel (B) a domain-based harplot with percent identities between domains, and panel (C) the proposed pathway of duplication and insertion events from a common ancestor.]

Figure 3. Orthologous pair of All0824 and Npun1852. A: pairwise representation of the domain architecture, as in Fig. 1 except that duplicated but diverged domains are connected by dotted lines. B: domain similarity within All0824 displayed as a domain-based harplot. Sequence identity between related domains of the two proteins is shown as a percent in each box that corresponds to the domains. Gray boxes indicate over 40% identity. Note that not all boxes are shown, as similarity between domains of different categories was not considered. C: proposed pathway in evolution to generate the current genes from a common ancestor.

Figure 4. Orthologous pair of All2379 and Npun6860. A: pairwise representation of the domain architecture as shown in Fig. 3. B: domain similarity between All2379 and Npun6860 displayed as a domain-based harplot. Sequence identity between related domains of the two proteins is shown as a percent in each box that corresponds to the domains. Gray boxes indicate over 40% identity. Note that not all boxes are shown, as similarity between domains of different categories was not considered. C: proposed pathway in evolution to generate the current genes from a common ancestor.

There are 35 orphan PAS domains, as mentioned above, which have been placed outside of the defined subclasses (bootstrap values lower than 50%). Anabaena and Nostoc have 15 and 20 such orphan PAS domains, respectively. Most of them are found in species-specific proteins (Fig. 2, PAS domains in black), while only one was detected as a domain of disagreement in one of the orthologous proteins (Fig. 1B, All1280). Orphan PAS domains must also have been generated after the branching of Anabaena and Nostoc. However, it is difficult to trace them back to their origins due to very low sequence conservation.

3.4.2. Duplication of domain sets

Further analysis of domain architecture revealed that not only single PAS domains but also sets of domains have been duplicated and transferred during evolution. Again, the best evidence is the tandem duplication of domains in All5210 (Fig. 5A). The second PAS domain (subclass #1) is homologous to the third PAS domain of All5210 (44% identity at the amino acid level), and the first GAF domain is most homologous to the second GAF domain (41%) in the whole Anabaena genome (Fig. 5B). All5210 also has another PAS domain of subclass #1 (the first PAS) and another GAF domain (the third one), but their similarity to the duplicated domains is much lower; identity between the first and the second PAS domains is only 21% (Fig. 5B). These findings indicate the direct duplication of a DNA fragment corresponding to 302 amino acid residues with almost no intervening space, although the PAS and GAF domains of the second part were later interrupted by the insertion of PAS#15.
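Direct tandem duplications of this kind were recognized in the study by harplot analysis. As a much simpler illustration of the same idea, the sketch below scans a nucleotide string for a segment that is immediately followed by an identical copy, and checks whether an inserted length would shift the reading frame. The test sequence uses the 11-bp unit shown in Fig. 5C (GACGCAAATTT, noted in Section 2.1 for Npun3208-3209); the flanking bases and function names are hypothetical.

```python
def tandem_duplications(seq, unit_len):
    """Start positions where a unit_len segment is directly followed by an identical copy."""
    return [i for i in range(len(seq) - 2 * unit_len + 1)
            if seq[i:i + unit_len] == seq[i + unit_len:i + 2 * unit_len]]

def shifts_reading_frame(inserted_len):
    """An insertion shifts the frame unless its length is a multiple of three."""
    return inserted_len % 3 != 0

unit = "GACGCAAATTT"                        # 11-bp unit shown in Fig. 5C
seq = "ATGGCA" + unit + unit + "CCGTAA"     # flanking bases are hypothetical

print(tandem_duplications(seq, len(unit)))                    # [6]
print(shifts_reading_frame(len(unit)))                        # True
print(shifts_reading_frame(381), shifts_reading_frame(384))   # False False
```

An exact-match scan like this only finds recent, undiverged repeats; diverged duplications such as the PAS and GAF set of All5210 need a similarity-based comparison, which is what the harplot provides.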

[Figure 5 diagram not reproduced: panel (A) paired domain architectures of All5210 and Npun3208-3209; panel (B) domain-based harplot of All5210 with percent identities between domains; panel (C) proposed pathway from a common ancestor through duplication and insertion events, including the 11-bp tandem duplication (GACGCAAATTT to GACGCAAATTTGACGCAAATTT) in Npun3208-3209.]

Figure 5. Orthologous pair of All5210 and Npun3208-3209. A: pairwise representation of the domain architecture as shown in Fig. 3. B: domain similarity within All5210 displayed as a domain-based harplot. Sequence identity between related domains of the two proteins is shown as a percent in each box that corresponds to the domains. Gray boxes indicate over 40% identity. Note that not all boxes are shown, as similarity between domains of different categories was not considered. C: proposed pathway in evolution to generate the current genes from a common ancestor.

Essentially the same domain structure is found in the Nostoc counterpart except for the following points. PAS#4 was inserted between the PAS#1 and GAF domains, and the whole ORF was interrupted by a frameshift mutation due to tandem duplication of 11 bp (GACGCAAATTT) (Fig. 5C). This duplication lies in the middle of the PAS#1 domain. As a result, a putative functional ORF was divided into two inactive ORFs, npun3208 and npun3209.

Multiple copies of DNA segments encompassing a set of domains can be seen in both genomes. One typical group carries a set of PAS#3, HK and RR domains (denoted PAS#3-HK-RR domains), which are found at the C-terminal end in three proteins in Anabaena (All1914, Alr1968 and All2379) and 6 proteins in Nostoc (Npun5547, Npun4393, Npun0705, Npun0993, Npun6860 and Npun1056). Six of them were already listed as three distinct orthologous pairs in Fig. 1, while the other three were orphan proteins (Npun0705, Npun0993, and Npun1056) as shown in Fig. 2. All of them share the PAS#3-HK-RR domains of about 510 amino acid residues in total, which show identity from 53% to 78% to each other. Further, Nostoc has one more protein of this group, Npun4667, which has a homologous region of PAS#3 and HK domains but lacks the C-terminal RR domain. Clustering analysis of these ten proteins based on these homologous domains did not give a clear phylogenetic relationship except for the three orthologous pairs. Perhaps the PAS#3-HK-RR domains were duplicated many times before and after the branching of Anabaena and Nostoc. Interestingly, nine of them have one or many PAS domains as a sole signal input domain. To gain insights into their functionality, we compared the sequence similarity of those PAS domains between the orthologous pairs. In the case of All1914 and Npun5547 (see Fig. 1), the first two PAS domains (subclasses #13 and #33) show identities of only 50% and 58%, respectively, whereas the third one (subclass #3) shows 64%. This suggests that the PAS#3 of the PAS#3-HK-RR domains may play a critical role for signal input in All1914 and Npun5547. On the other hand, Alr1968 and Npun4393 have two PAS domains (subclass #1 and #3) that show similar conservation between Anabaena and Nostoc (72% identity). All2379 and Npun6860 have five and four PAS domains, respectively, as mentioned in the previous section, and PAS#21 showed the highest conservation between Anabaena and Nostoc (82% identity in Fig. 4B). These features suggest that the PAS#3 of the PAS#3-HK-RR domains is not necessarily a critical sensor domain for the HK proteins. Another example can be seen in a set of PAS#2, HK and RR domains in both genomes (All0182/Npun2175 and All1846/Npun1355 in Fig. 1A and Alr3902, Npun3200 and Npun0711 in Fig. 2).

Further strong evidence comes from the complete duplication of adenylate cyclase, CyaB1 (Alr2266 and Npun2022) and CyaB2 (All1904 and Npun3326).30 They have two GAF domains, one PAS#17 domain and a catalytic domain of adenylate cyclase. Biochemical and mutational analyses of Anabaena CyaB1 have revealed that the second GAF domain works as a cyclic GMP-binding domain, while the single PAS domain has not yet been analyzed in detail.31 In this case, the two genes are significantly homologous to each other even at the nucleotide level. However, there was no further similarity in the flanking regions between the two loci. This suggests that the whole cyaB gene was duplicated but inserted into a different position of the genome before the branching of Anabaena and Nostoc.

3.5. Conserved PAS domains in orthologous proteins

There are 61 pairs (122 domains) of conserved PAS domains in the orthologous proteins between Anabaena and Nostoc. They consist of all the PAS domains in Fig. 1A and the conserved PAS domains in Fig. 1B (connected by lines). We call these conserved pairs “orthologous PAS domains.” These 61 pairs of PAS domains are our choice for further bioinformatic analysis. They showed sequence conservation ranging from 36% to 90% identity. For example, the four orthologous PAS domains of Alr3170 and Npun5680 (see Fig. 1A) show sequence identity of 36%, 66%, 77% and 67%, respectively. In this case, the first PAS domain may not have a critical role, judging from its low conservation. Notably, the second PAS domain may bind flavin based on sequence similarity (see below), while no clear features can be found in the other two. However, we cannot exclude the possibility that the latter two PAS domains also retain a role in signal perception, since they are highly conserved between the two species.

Here we performed mapping analysis to detect domains with novel function apart from similarity search. The 3D structure of a few known PAS domains shows a common clam-shaped fold, consisting of the flat side of five β-strands and the convex side of five α-helices.10 The inner space may adaptively fit various ligands such as FAD, FMN, heme or coumaric acid. We adopted the common structure naming system of Taylor et al. (1999). It is known that a cysteine residue in Eα covalently binds to FMN during the photocycle of the blue light sensor Phy3 in fern.15 A histidine residue in a loop between Eα and Fα coordinates the iron atom of the heme of the O2 sensor FixL protein of Rhizobium species.12,13 A cysteine residue in Fα covalently binds to coumaric acid in the photoactive yellow proteins.11 Based on the 3D structure, these ligand-binding residues are mapped in a narrow region, which we call the Eα-loop-Fα region. Extending this finding, we predicted the secondary structure of all of the orthologous PAS domains and found that 30 pairs of

(A)

[Phylogenetic tree; scale bar 0.1. Leaves: Phot1_1/2 and Phot1_2/2; subclass #10 domains Npun5680_2/4, Alr3170_2/4, Alr1229_8/9, Npun0349_2/3, All2875_2/5 and Npun1005_2/2; subclass #27 domains Npun1874_1/1 and Alr2428_3/3 (outgroup). Node support values shown in the figure: 944, 743, 758, 977, 1000, 673, 964.]

(B)

Region shown: Dα, Eα

    Phot1_1/2      KEVVGRNCRFLQGSGT
    Phot1_2/2      EEILGRNCRFLQGPET
    Alr3170_2/4    GEVIGQNCRFLQTNET
    Npun5680_2/4   SEVIGRNCRFLQGNDL
    All2875_2/5    TEVVGRNCRFLLGNDT
    Npun0349_2/3   ADVIGQNCRFLQRTDS
    Alr1229_8/9    DEVIGQNFRLFQSADI
    Npun1005_2/2   EEVLGKTPRILQGAKT
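The panel B segments above can be checked for the cysteine position with a few lines of code. The following is only an illustrative sketch, not the procedure used in this study: the rows are transcribed from panel B, the cysteine column is assumed to be the one occupied by cysteine in Phot1_1/2, and the function name `split_by_residue` is hypothetical.

```python
# Dα-Eα segments transcribed from panel B (16 ungapped residues each).
PANEL_B = {
    "Phot1_1/2": "KEVVGRNCRFLQGSGT",
    "Phot1_2/2": "EEILGRNCRFLQGPET",
    "Alr3170_2/4": "GEVIGQNCRFLQTNET",
    "Npun5680_2/4": "SEVIGRNCRFLQGNDL",
    "All2875_2/5": "TEVVGRNCRFLLGNDT",
    "Npun0349_2/3": "ADVIGQNCRFLQRTDS",
    "Alr1229_8/9": "DEVIGQNFRLFQSADI",
    "Npun1005_2/2": "EEVLGKTPRILQGAKT",
}


def split_by_residue(alignment, reference, residue="C"):
    """Find `residue` in the reference row, then split rows by that column.

    Returns the 0-based column, the names of rows carrying the residue
    there, and a dict of the substituting residue for every other row.
    """
    column = alignment[reference].index(residue)  # first occurrence only
    conserved = [name for name, seq in alignment.items()
                 if seq[column] == residue]
    substituted = {name: seq[column] for name, seq in alignment.items()
                   if seq[column] != residue}
    return column, conserved, substituted


if __name__ == "__main__":
    column, conserved, substituted = split_by_residue(PANEL_B, "Phot1_1/2")
    print(f"alignment column {column + 1}")
    print("cysteine present:", ", ".join(conserved))
    for name, aa in substituted.items():
        print(f"{name}: {aa} instead of C")
```

With the rows as transcribed here, the two rows that lack cysteine in this column are Alr1229_8/9 and Npun1005_2/2.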

Figure 6. Putative flavin-binding PAS domain of subclass #10. A: phylogenetic tree of PAS#10 and two known FMN-binding domains of Phot1. PAS domains of subclass #27 were used for the outgroup. B: multiple alignment of a part of the flavin-binding region of PAS#10. Highlighted cysteine residues in α-helix Eα are a putative covalent binding site for flavin.

domains carried conserved histidine or cysteine residues in the Eα-loop-Fα region. These include two pairs of PAS#10 and a pair of PAS#27, which are homologous to known ligand-binding PAS domains.

PAS#10 is significantly homologous to the FMN-binding blue light sensor of plants. Phylogenetically, two pairs of orthologous and two pairs of non-orthologous PAS domains of PAS#10 are significantly homologous to this FMN-binding domain (Fig. 6A). The cysteine residue in the Eα is known to bind the isoalloxazine ring covalently during the photocycle.32 Sequence alignment of the homologous PAS domains shows that the key cysteine residue is conserved at the same position in the orthologous pairs but not in the non-orthologous PAS domains (Fig. 6B). It is thus suggested that domains conserved as orthologous pairs are more likely to retain their original structure and function, while the non-orthologous pairs may have deviated too much during evolution. Alternatively, the latter PAS domains may play species-specific roles as a novel sensor.

PAS#27 is significantly homologous to the heme-binding PAS domains of FixL protein. There is a pair of PAS#27 that is found in the orthologous proteins of Alr2428/Npun1874 (Fig. 1B). Among the putative sensory domains, only PAS#27 is conserved between Alr2428 and Npun1874. In addition to PAS#27, Alr2428 has two more PAS domains and one more GAF domain. Heterologous expression of the PAS#27 of Alr2428 in E. coli was successful and allowed us to show its oxygen-responsive nature.33 Thus, the common PAS domain seems to play a more important role than the others in these species.

Another orthologous pair (PAS#63), which is present in the orthologous proteins of All0707/Npun2709, has a conserved cysteine residue in the Fα region (Fig. 7). Notably, PAS#63 is also found as an orthologous domain

Secondary-structure elements, residues 1-56: N-cap α, Aβ, Bβ, Cα, Dα, Eα

    PSIPRED        -HHHHHHHH----EEEE-----EEEE-HHHHHH----HHHHH--HHHHHH---HHH   56
    All0707_1/1    DQFLTFAQVLPEPLVLITSTGEILAVNQAATKLFSKTSKALIGQQLSEFVTDSPQK   56
    Npun2709_1/1   EQFLEFARVLPEPLLLVSGEGQLLATNQPVADMLGLRRQELRGKMLFELVTQSTDD   56
    Tery2770_1/1   EQFIELARVLPEPSLFLTQKGEILAMNKLASSIFGYRSKDLQGQQIYDFVKESENI   56

Secondary-structure elements, residues 57-112: Fα, Gβ, Hβ, Iβ

    PSIPRED        HHHHHHHHH------EEEEEEE------EEEEEEEEEEE------EEEEEEE--    112
    All0707_1/1    VSDYLRVCAQSRQMILGAFTIHQAVGEGIACRSQGAVIQPRSHDSPAINLLRLEKR  112
    Npun2709_1/1   VVKYLQACSSSRAMVIGSLTLRKNDGQTLICRSQGAVIQPWSPESSSLILLRLENR  112
    Tery2770_1/1   VTEYLQACSRNRKMVIGSLTICTADGANVVCRSRGAVVEPASSGLPAKIFLRLEKN  112
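Percent identity between aligned PAS domains is quoted throughout this section, and the alignment above allows a small worked example. The sketch below is illustrative only and makes its own assumptions: identity is taken as identical columns divided by the alignment length (112 ungapped columns), which need not match the definition used for the values reported in the text, and the helper names `percent_identity` and `conserved_columns` are hypothetical.

```python
from itertools import combinations

# Rows transcribed from the alignment above (two 56-residue blocks each).
PAS63 = {
    "All0707_1/1": (
        "DQFLTFAQVLPEPLVLITSTGEILAVNQAATKLFSKTSKALIGQQLSEFVTDSPQK"
        "VSDYLRVCAQSRQMILGAFTIHQAVGEGIACRSQGAVIQPRSHDSPAINLLRLEKR"
    ),
    "Npun2709_1/1": (
        "EQFLEFARVLPEPLLLVSGEGQLLATNQPVADMLGLRRQELRGKMLFELVTQSTDD"
        "VVKYLQACSSSRAMVIGSLTLRKNDGQTLICRSQGAVIQPWSPESSSLILLRLENR"
    ),
    "Tery2770_1/1": (
        "EQFIELARVLPEPSLFLTQKGEILAMNKLASSIFGYRSKDLQGQQIYDFVKESENI"
        "VTEYLQACSRNRKMVIGSLTICTADGANVVCRSRGAVVEPASSGLPAKIFLRLEKN"
    ),
}


def percent_identity(a, b):
    """Identical columns as a percentage of the (ungapped) alignment length."""
    if len(a) != len(b):
        raise ValueError("rows must come from the same alignment")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def conserved_columns(rows, residue):
    """1-based alignment columns in which every row carries `residue`."""
    return [i + 1 for i, column in enumerate(zip(*rows))
            if set(column) == {residue}]


if __name__ == "__main__":
    for (name_a, seq_a), (name_b, seq_b) in combinations(PAS63.items(), 2):
        print(f"{name_a} vs {name_b}: {percent_identity(seq_a, seq_b):.1f}%")
    print("fully conserved Cys columns:",
          conserved_columns(PAS63.values(), "C"))
```

Two columns carry cysteine in all three rows as transcribed here; which of them the arrowhead in the figure marks cannot be read from the extracted alignment alone.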

Figure 7. Multiple alignment of the entire PAS domain of subclass #63. Predicted secondary structure is shown in the top row. ‘E’ and ‘H’ represent β-sheet and α-helix, respectively. Arrowhead indicates the conserved cysteine residue within the Eα-loop-Fα region. Identical residues are shown in gray.

in an N2-fixing non-heterocyst-forming cyanobacterium, Trichodesmium erythraeum IMS 101 (Trichodesmium), while keeping the same cysteine residue in the Fα region. Since this PAS#63 is the sole putative sensory domain in the two-component sensor proteins All0707, Npun2709 and Tery2770, PAS#63 may be an important sensory domain in N2 fixation.

We also detected a novel motif, HPDD-E[FY]R, in 11 different pairs of orthologous PAS domains.34 Since these PAS domains comprise eight distinct subclasses (#2, #5, #14, #15, #21, #22, #32, #33), they may be engaged in ligation of a common cofactor or may have a more general role such as dimerization of PAS domains or stabilization of the flexible PAS fold. Finally, we have not yet found any specific features for the 15 remaining pairs of the orthologous PAS domains with a cysteine or histidine residue in the Eα-loop-Fα region. Further bioinformatic and biochemical work covering all the sequenced organisms will be needed for a thorough understanding of the function and evolution of the PAS superfamily.

3.6. Perspectives

In this study, we detected 143 PAS domains from Anabaena and 180 PAS domains from Nostoc. By the clustering and orthologous domain analyses, we detected at least 64 distinct subclasses, which contain 61 pairs of orthologous PAS domains. These pairs have been conserved since before the two species branched from each other. The other 82 PAS domains from Anabaena and 119 PAS domains from Nostoc are considered to have appeared by gene duplication and shuffling after the branching. Since the total number of PAS domains of Anabaena and Nostoc is much larger than that in other cyanobacteria, we assume in this communication that domains of disagreement were mostly produced by duplication in one species or the other. However, those amplified domains must have quickly diverged afterward in every case, judging from the low sequence conservation. Accordingly, we conclude that the functionally important PAS domains are conserved as orthologous pairs between the two species, which were further subjected to mapping analysis for the detection of potential ligand binding. On the other hand, we must take into consideration another possibility, that some domains of disagreement may play a species-specific role. Namely, they might have been acquired by lateral gene transfer, or their counterparts may have been lost in only one lineage. In either case, related PAS domains may be found outside Anabaena and Nostoc by a large-scale comprehensive search with all of the PAS domains in the non-redundant database.

Since the PAS domains consist of a wide variety of distinct subclasses, the standard for computational search has not yet been fully established. The custom profile HMM established in this study detected more PAS domains (e.g., PAS#63), which were only marginally scored by the current version of the profile HMM in Pfam or SMART. Using this custom profile, we extracted all the PAS domains in the complete genomes of 98 prokaryote species. It was found that most prokaryotes harbor fewer than 50 PAS domains per genome irrespective of the genome size. For example, Streptomyces coelicolor, whose genome size is 9.05 Mb, harbors 21 PAS domains, and Mesorhizobium loti, whose genome size is 7.6 Mb, harbors 45 PAS domains. Conversely, two species of euryarchaeota are rich in PAS domains: 163 PAS domains in 59 ORFs for Methanosarcina acetivorans strain C2A (5.75 Mb) and 96 PAS domains in 37 ORFs for M. mazei strain Goe1 (4.1 Mb).35,36 Since these two species are related to each other, clustering and ortholog analyses at the domain level would provide deeper evolutionary views about the diversity and functionality of PAS domains. Finally, combinatorial search of closely linked signaling domains such as GAF, HK and RR will also provide insight into the evolution of multidomain signaling proteins.

Acknowledgements: This work was partly supported by a Research Fellowship for Young Scientists from the Japan Society for the Promotion of Science (to S. O.), by a Grant-in-Aid for Scientific Research “Genome Biology” (to M. O. and M. I.) from the Ministry of Education and Science, and by the Program for Promotion of Basic Research Activities for Innovative Biosciences of Japan (to M. O. and M. I.).

References

1. Galperin, M. Y., Nikolskaya, A. N., and Koonin, E. V. 2001, Novel domains of the prokaryotic two-component signal transduction systems, FEMS Microbiol. Lett., 203, 11–21.
2. Ponting, C. P. and Aravind, L. 1997, PAS: a multifunctional domain family comes to light, Curr. Biol., 7, R674–677.
3. Aravind, L. and Ponting, C. P. 1997, The GAF domain: an evolutionary link between diverse phototransducing proteins, Trends Biochem. Sci., 22, 458–459.
4. Ponting, C. P. 1997, CBS domains in ClC chloride channels implicated in myotonia and nephrolithiasis (kidney stones), J. Mol. Med., 75, 160–163.
5. Aravind, L. and Ponting, C. P. 1999, The cytoplasmic helical linker domain of receptor histidine kinase and methyl-accepting proteins is common to many prokaryotic signalling proteins, FEMS Microbiol. Lett., 176, 111–116.
6. Hofmann, K. and Bucher, P. 1995, The FHA domain: a putative nuclear signalling domain found in protein kinases and transcription factors, Trends Biochem. Sci., 20, 347–349.
7. West, A. H. and Stock, A. M. 2001, Histidine kinases and response regulator proteins in two-component signaling systems, Trends Biochem. Sci., 26, 369–376.
8. Hurley, J. H. 1998, The adenylyl and guanylyl cyclase superfamily, Curr. Opin. Struct. Biol., 8, 770–777.
9. Hanks, S. K. and Hunter, T. 1995, Protein kinases 6. The eukaryotic protein kinase superfamily: kinase (catalytic) domain structure and classification, FASEB J., 9, 576–596.
10. Taylor, B. L., Zhulin, I. B., and Johnson, M. S. 1999, Aerotaxis and other energy-sensing behavior in bacteria, Annu. Rev. Microbiol., 53, 103–128.
11. Borgstahl, G. E., Williams, D. R., and Getzoff, E. D. 1995, 1.4 Å structure of photoactive yellow protein, a cytosolic photoreceptor: unusual fold, active site, and chromophore, Biochemistry, 34, 6278–6287.
12. Gong, W., Hao, B., Mansy, S. S., Gonzalez, G., Gilles-Gonzalez, M. A., and Chan, M. K. 1998, Structure of a biological oxygen sensor: a new mechanism for heme-driven signal transduction, Proc. Natl. Acad. Sci. U.S.A., 95, 15177–15182.
13. Miyatake, H., Mukai, M., Park, S. Y., et al. 2000, Sensory mechanism of oxygen sensor FixL from Rhizobium meliloti: crystallographic, mutagenesis and resonance Raman spectroscopic studies, J. Mol. Biol., 301, 415–431.
14. Morais Cabral, J. H., Lee, A., Cohen, S. L., Chait, B. T., Li, M., and Mackinnon, R. 1998, Crystal structure and functional analysis of the HERG potassium channel N terminus: a eukaryotic PAS domain, Cell, 95, 649–655.
15. Crosson, S. and Moffat, K. 2001, Structure of a flavin-binding plant photoreceptor domain: insights into light-mediated signal transduction, Proc. Natl. Acad. Sci. U.S.A., 98, 2995–3000.
16. Amezcua, C. A., Harper, S. M., Rutter, J., and Gardner, K. H. 2002, Structure and interactions of PAS kinase N-terminal PAS domain: model for intramolecular kinase regulation, Structure (Camb), 10, 1349–1361.
17. Kaneko, T., Nakamura, Y., Wolk, C. P., et al. 2001, Complete genomic sequence of the filamentous nitrogen-fixing cyanobacterium Anabaena sp. strain PCC 7120, DNA Res., 8, 205–213; 227–253.
18. Ohmori, M., Ikeuchi, M., Sato, N., et al. 2001, Characterization of genes encoding multi-domain proteins in the genome of the filamentous nitrogen-fixing cyanobacterium Anabaena sp. strain PCC 7120, DNA Res., 8, 271–284.
19. Meeks, J., Elhai, J., Potts, M., et al. 2001, An overview of the genome of Nostoc punctiforme, a multicellular, symbiotic cyanobacterium, Photosyn. Res., 70, 85–106.
20. Altschul, S. F., Madden, T. L., Schaffer, A. A., et al. 1997, Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Res., 25, 3389–3402.
21. Bateman, A., Birney, E., Cerruti, L., et al. 2002, The Pfam protein families database, Nucleic Acids Res., 30, 276–280.
22. Schultz, J., Milpetz, F., Bork, P., and Ponting, C. P. 1998, SMART, a simple modular architecture research tool: identification of signaling domains, Proc. Natl. Acad. Sci. U.S.A., 95, 5857–5864.
23. Thompson, J. D., Gibson, T. J., Plewniak, F., Jeanmougin, F., and Higgins, D. G. 1997, The CLUSTAL X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools, Nucleic Acids Res., 25, 4876–4882.
24. Page, R. D. 1996, TreeView: an application to display phylogenetic trees on personal computers, Comput. Appl. Biosci., 12, 357–358.
25. McGuffin, L. J., Bryson, K., and Jones, D. T. 2000, The PSIPRED protein structure prediction server, Bioinformatics, 16, 404–405.
26. Eddy, S. R. 1998, Profile hidden Markov models, Bioinformatics, 14, 755–763.
27. Suzuki, I., Kanesaki, Y., Mikami, K., Kanehisa, M., and Murata, N. 2001, Cold-regulated genes under control of the cold sensor Hik33 in Synechocystis, Mol. Microbiol., 40, 235–244.
28. van Waasbergen, L. G., Dolganov, N., and Grossman, A. R. 2002, nblS, a gene involved in controlling photosynthesis-related gene expression during high light and nutrient stress in Synechococcus elongatus PCC 7942, J. Bacteriol., 184, 2481–2490.
29. Hirani, T. A., Suzuki, I., Murata, N., Hayashi, H., and Eaton-Rye, J. J. 2001, Characterization of a two-component signal transduction system involved in the induction of alkaline phosphatase under phosphate-limiting conditions in Synechocystis sp. PCC 6803, Plant Mol. Biol., 45, 133–144.
30. Katayama, M. and Ohmori, M. 1997, Isolation and characterization of multiple adenylate cyclase genes from the cyanobacterium Anabaena sp. strain PCC 7120, J. Bacteriol., 179, 3588–3593.
31. Kanacher, T., Schultz, A., Linder, J. U., and Schultz, J. E. 2002, A GAF-domain-regulated adenylyl cyclase from Anabaena is a self-activating cAMP switch, EMBO J., 21, 3672–3680.
32. Salomon, M., Christie, J. M., Knieb, E., Lempert, U., and Briggs, W. R. 2000, Photochemical and mutational analysis of the FMN-binding domains of the plant blue light receptor, phototropin, Biochemistry, 39, 9401–9410.
33. Narikawa, R., Miyatake, H., Kim, S., et al. 2003, Biochemical analysis of a novel oxygen sensor protein in the N2-fixing cyanobacterium Anabaena sp. PCC 7120, Plant Cell Physiol. Suppl., 44, 81.
34. Narikawa, R., Okamoto, S., Ikeuchi, M., and Ohmori, M. 2003, Newly identified motifs within PAS domains of filamentous cyanobacteria, Genome Inform. Ser. Workshop Genome Inform., 14, 444–445.
35. Galagan, J. E., Nusbaum, C., Roy, A., et al. 2002, The genome of M. acetivorans reveals extensive metabolic and physiological diversity, Genome Res., 12, 532–542.
36. Deppenmeier, U., Johann, A., Hartsch, T., et al. 2002, The genome of Methanosarcina mazei: evidence for lateral gene transfer between bacteria and archaea, J. Mol. Microbiol. Biotechnol., 4, 453–461.