Proc. Natl. Acad. Sci. USA
Vol. 85, pp. 5006-5010, July 1988
Biochemistry

Sequence and nitrate regulation of the Arabidopsis thaliana mRNA encoding nitrate reductase, a metalloflavoprotein with three functional domains
(nitrate assimilation/plant gene expression)

NIGEL M. CRAWFORD*†, MARY SMITH*, DANIEL BELLISSIMO‡, AND RONALD W. DAVIS*

*Department of Biochemistry, Stanford University School of Medicine, Stanford, CA 94305; and ‡Department of Biochemistry, Duke University Medical Center, Durham, NC 27710

Contributed by Ronald W. Davis, March 28, 1988

ABSTRACT The sequence of nitrate reductase (EC 1.6.6.1) mRNA from the plant Arabidopsis thaliana has been determined. A 3.0-kilobase-long cDNA was isolated from a λgt10 cDNA library of Arabidopsis leaf poly(A)+ RNA. The cDNA hybridized to a 3.2-kilobase mRNA whose level increased 15-fold in response to treatment of the plant with nitrate. An open reading frame encoding a 917 amino acid protein was found in the sequence. This protein is very similar to tobacco nitrate reductase, being >80% identical within a section of 450 amino acids. By comparing the Arabidopsis protein sequence with other protein sequences, three functional domains were deduced: (i) a molybdenum-pterin-binding domain that is similar to the molybdenum-pterin-binding domain of rat liver sulfite oxidase, (ii) a heme-binding domain that is similar to proteins in the cytochrome b5 superfamily, and (iii) an FAD-binding domain that is similar to NADH-cytochrome b5 reductase.

Nitrate is an important source of nitrogen for higher plants. Plants assimilate nitrate by taking nitrate up from the soil and reducing it to ammonia, which is incorporated directly into the amino acid pool. The conversion of nitrate to ammonia is catalyzed by two enzymes, nitrate reductase and nitrite reductase. These enzymes are present in the roots and leaves of plants. The levels of these enzymes respond to several environmental stimuli, including nitrate and light. cDNA clones encoding the two enzymes have been isolated from several species of plants and have been used as nucleic acid probes to demonstrate that nitrate and nitrite reductase mRNA levels increase substantially upon nitrate treatment of plants (1-4). The nitrate-assimilation pathway is one of the few systems documented in plants that has a substrate-inducible enzyme.

We have focused on the first enzyme in the pathway, nitrate reductase (EC 1.6.6.1), because of the wealth of biochemical information available for this protein (5). It is a multicenter electron-transfer protein that catalyzes the reaction

$$\mathrm{NO_3^- + NADH + H^+ \longrightarrow NO_2^- + NAD^+ + H_2O}$$

In higher plants the native enzyme is a homodimer of 110-115 kDa. Each subunit is associated with three prosthetic groups: flavin, heme, and molybdenum, each one comprising a redox center. The flavin group is FAD; the heme is typical of those found in b-type cytochromes, and the molybdenum is complexed with a small organic pterin cofactor. There is some evidence that these prosthetic groups are bound by distinct domains. Proteolytic digestion of nitrate reductase from Chlorella releases an intact functional domain that retains the heme and molybdenum groups (6). This fragment can still reduce nitrate, but only with reduced methyl viologen as the reductant. It has lost the ability to use NADH as the reductant, which is characteristic of an FAD-binding domain. Based on these and other data, it has been proposed that each subunit of nitrate reductase consists of three domains: one that binds FAD, one that binds heme, and one that binds molybdenum-pterin (5). We have sequenced a nitrate reductase cDNA from a higher plant to obtain the complete primary structure of the enzyme. We wished to see if the sequence would be consistent with this three-domain model.

We chose to clone and sequence a nitrate reductase cDNA§ from the plant Arabidopsis thaliana (Crucifer family). In addition to shedding light on the structure of nitrate reductase, this organism offers exceptional opportunities for studying molecular genetics in higher plants (7). One can select for mutants that are blocked in the nitrate-assimilation pathway by using chlorate, the chlorine analog of nitrate. Chlorate is reduced to a toxic compound by nitrate reductase. Resistant mutants are unable to reduce chlorate because of a defect in the nitrate reductase apoenzyme or in the synthesis of the molybdenum-pterin cofactor. Chlorate-resistant mutants of Arabidopsis have been characterized and shown to fall into eight complementation groups (8). In addition, nitrate reductase levels in the leaves of Arabidopsis are influenced by nitrate (9). We wished to take advantage of this system by isolating and characterizing a nitrate reductase cDNA from this plant. Previously, we reported the isolation of a partial cDNA encoding nitrate reductase from squash (Cucurbita family) (1). Because the Cucurbita family is so closely related to the Crucifer family, it was possible to use the squash cDNA to isolate the Arabidopsis clone. Our work was further aided by the report of a partial sequence of nitrate reductase from tobacco (3).

MATERIALS AND METHODS

Plant Material. A. thaliana (strain Columbia) plants were grown in vermiculite/perlite/sand (10:3:1 weight ratio) under continuous illumination and were irrigated every other day with a nutrient medium containing 10 mM potassium phosphate (pH 5.5), 2 mM MgSO4, 1 mM CaCl2, 0.1 mM Fe-EDTA, 50 µM H3BO3, 12 µM MnSO4, 1 µM ZnCl2, 1 µM CuSO4, 0.2 µM Na2MoO4, and one of the following: 5 mM KNO3 or 3 mM NH4NO3 or 2 mM (NH4)2SO4. Leaves were harvested 16-20 days after planting and then stored at -85°C.

†Present address: Department of Biology, C-016, University of California at San Diego, La Jolla, CA 92093.
§The sequence reported in this paper is being deposited in the EMBL/GenBank data base (IntelliGenetics, Mountain View, CA, and Eur. Mol. Biol. Lab., Heidelberg) (accession no. J03240).
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Protein Procedures.
To prepare protein extracts, frozen tissue was ground at 0°C in 100 mM potassium phosphate, pH 7.5/1 mM EDTA/1 mM dithiothreitol. The homogenate was centrifuged at 4°C at 12,000 × g for 5 min. The supernatant was assayed for nitrate reductase activity as described (1); one unit of activity is defined as the production of 1 nmol of nitrite per min at 30°C. Peptide sequences of sulfite oxidase were generated from the molybdenum-binding domain of the enzyme produced by tryptic cleavage as described (10, 11).

Nucleic Acid Procedures. Poly(A)+ RNA was prepared (1) from the frozen leaf tissue with a yield of 0.5-1.0 µg of poly(A)+ RNA per g (fresh weight). RNA blot analysis was performed as described (1). A λgt10 (12) cDNA library was prepared (1) by using cDNA purified by gel electrophoresis. The library was screened with radiolabeled squash nitrate reductase cDNA (pCmc-1) as described (1, 12) under low-stringency conditions: 30% (vol/vol) formamide/5 × SSPE (13)/5 × BFP (13)/0.2 mg of single-stranded DNA per ml/5% (wt/vol) sodium dextran sulfate/0.2 mg of poly(U) per ml at 37°C with a final wash of 0.2 × SSPE/0.1% NaDodSO4 at 40°C.

cDNA inserts from the isolated λ phage were subcloned into the EcoRI site of the Bluescript vector KS+ (Stratagene, San Diego, CA). Random deletions were constructed by use of DNase I (14), and then single-stranded DNA was prepared and sequenced by the chain-termination method (15). Sequence analysis and comparisons were performed on the BIONET system (IntelliGenetics, Mountain View, CA) with the XFASTP, SEQ, PEP, and GENALIGN programs.

RESULTS

Isolation of Nitrate Reductase cDNA Clones. Previously, we isolated a cDNA clone (pCmc-1) that encoded part of the nitrate reductase protein from squash, Cucurbita maxima (1). This clone was used as a heterologous probe in an attempt to isolate a nitrate reductase cDNA clone from Arabidopsis. This strategy had a good chance of working because nitrate reductases from plants as distantly related as barley and Arabidopsis share at least one common epitope (16), and the pCmc-1 DNA weakly hybridizes to an mRNA from Arabidopsis that has properties identical with those of the nitrate reductase mRNA identified in squash; that is, it is 3.2 kilobases (kb) long and is nitrate-regulated (unpublished data and ref. 1).

A λgt10 cDNA library was prepared from poly(A)+ RNA from the leaves of nitrate-grown Arabidopsis plants. The cDNA library was enriched for sequences longer than 3 kb. Sixteen positive phage were obtained from screening 5,000 recombinant phage by using the pCmc-1 DNA as probe. One positive phage (λ10Atc-22) with a 3.1-kb insert was selected for further analysis. The insert was subcloned, and the plasmid containing this insert was designated pNN351 (pAtc-46).

RNA Analysis. RNA blot hybridization revealed that the Arabidopsis cDNA hybridized to a leaf mRNA that is 3.2 kb long and that is nitrate-regulated (Fig. 1). Plants grown with nitrate (lane 2) had about 15 times more 3.2-kb mRNA than plants grown with ammonia as the sole source of nitrogen (lane 1). A mixing experiment demonstrated that it is the presence or absence of nitrate that affects the 3.2-kb mRNA levels and not a repressive effect of ammonia. When plants were grown on both ammonia and nitrate, the 3.2-kb mRNA level was as high as in plants grown on nitrate alone (lane 3). The 3.2-kb mRNA was also increased 15-fold in the leaves of ammonia-grown plants by irrigating them with nitrate-containing medium (lanes 4 and 5). The levels of nitrate reductase activity were measured in the same tissue samples and were found to correlate very well with the mRNA levels (Fig. 1). These properties of the Arabidopsis 3.2-kb mRNA are identical to those that have been observed for nitrate reductase mRNA in squash cotyledons (1). The Arabidopsis cDNA also hybridized to an mRNA ≈0.8 kb in length whose level was the same in each lane (data not shown). The identity of this mRNA is unknown.

Sequence and Comparisons. The pAtc-46 cDNA was sequenced in both directions (Fig. 2). Primer extension, S1 nuclease mapping, and sequence analysis indicated that the start of transcription is 17 ± 5 nucleotides upstream from the 5' end of the cDNA (17). Thus, the mRNA was calculated to be 3050 nucleotides long without the poly(A) tail, a length similar to the 3.2 kb estimated by agarose gel electrophoresis. An open reading frame encoding a protein of 917 amino acids (Mr = 103,000; pI = 6.1) was deduced from the sequence (Fig. 2). No initiation codon was found upstream from the one shown. The calculated molecular mass is close to the published mass (115 kDa) for nitrate reductase from other dicotyledonous plants as determined by polyacrylamide gel electrophoresis (5). The Arabidopsis protein has a high proportion of proline residues (6.1%) and may migrate slightly slower in polyacrylamide gels than would be predicted from the calculated mass. When the protein sequence was compared to a partial sequence of nitrate reductase from tobacco (3), a region of 450 amino acids (residues 179-629 of the Arabidopsis sequence) was found to be very similar, having >80% identical amino acids. We concluded that the Arabidopsis cDNA encoded nitrate reductase.

To identify functional domains of the enzyme, the Protein Identification Resource¶ and Swiss-Prot‖ were searched for sequences similar to the nitrate reductase sequence. Two groups of proteins were found. The first group is similar to residues 540-620 of nitrate reductase (Fig. 3). This group is the cytochrome b5 superfamily of proteins that bind heme (21) and includes microsomal, mitochondrial, and erythro-

¶Protein Identification Resource (1988) Protein Sequence Database (Natl. Biomed. Res. Found., Washington, DC), Release 15.
‖Swiss-Prot (1988) Protein Sequence Database (Eur. Mol. Biol. Lab., Heidelberg), Release 6.

Lane               1       2       3              4               5
Nitrogen source    NH4+    NO3-    NH4+ + NO3-    NH4+ then Cl-   NH4+ then NO3-
RNA levels         1.0     14      16             0.9             14
NRA levels         0.9     28      20             0.7             11

FIG. 1. RNA blot analysis of Arabidopsis mRNA (3.2-kb band). Arabidopsis plants were grown as described in Materials and Methods and irrigated with a defined medium with various nitrogen sources. Poly(A)+ RNA was prepared from the leaves of plants, treated with glyoxal, and then fractionated by agarose gel electrophoresis (1 µg per lane). The RNA was blotted onto a nylon filter and hybridized with radiolabeled pAtc-46 DNA. Each lane corresponds to RNA from plants grown with the nitrogen source designated above the lane. Densitometer tracings of the autoradiogram were used to measure RNA levels. Hybridization of the blots to radiolabeled control cDNA verified that the same amount of RNA had been loaded onto each lane (data not shown). Nitrate reductase activity (NRA) levels were determined in the same tissues used for the RNA preparations.
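The fold-induction figures quoted in RNA Analysis can be re-derived from the relative levels reported with Fig. 1. The sketch below is illustrative only: it assumes the lane order and the level values as read from Fig. 1, and the names `LANES`, `fold_change`, and `pearson_r` are hypothetical helpers introduced here, not part of the original analysis.

```python
from math import sqrt

# Relative levels as read from Fig. 1 (the lane order is an assumption of this sketch).
LANES = ["NH4+", "NO3-", "NH4+ + NO3-", "NH4+ then Cl-", "NH4+ then NO3-"]
RNA_LEVELS = [1.0, 14.0, 16.0, 0.9, 14.0]  # 3.2-kb mRNA, densitometer units
NRA_LEVELS = [0.9, 28.0, 20.0, 0.7, 11.0]  # nitrate reductase activity


def fold_change(treated: float, control: float) -> float:
    """Ratio of a treated level to its control level."""
    if control <= 0:
        raise ValueError("control level must be positive")
    return treated / control


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Lane 2 vs lane 1: nitrate-grown vs ammonia-grown plants.
    grown = fold_change(RNA_LEVELS[1], RNA_LEVELS[0])
    # Lane 5 vs lane 4: ammonia-grown plants irrigated with nitrate vs chloride.
    irrigated = fold_change(RNA_LEVELS[4], RNA_LEVELS[3])
    print(f"grown on nitrate: {grown:.1f}-fold")
    print(f"nitrate irrigation: {irrigated:.1f}-fold")
    print(f"mRNA vs activity, Pearson r = {pearson_r(RNA_LEVELS, NRA_LEVELS):.2f}")
```

Both ratios land near the roughly 15-fold induction reported in the text, and the correlation coefficient gives one simple way to quantify how closely activity tracks the mRNA level across the five lanes.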

TTT CTC ACT GAC AAG AGA GAG ATA GAG AGA GAA AAA GGC TCT GCT TTC TAC ATT 54 TGC TGG TTT AGA GTG AAG ACT AAC GTG TGC AAG CCA CAC AAG GGA GAG ATT GGG 1566 Cys Trp Phe Arg Val Lys Thr Asn Val Cys Lys Pro His Lys Gly Glu Ile Gly 496 ATT TAC GAT TAT ACA CTT TCC AAC ATG GCG GCC TCT GTA GAT AAT CGC CAA TAC 108 MET Ala Ala Ser Val Asp Asn Arg Gln Tyr 10 ATT GTG TTC GAG CAT CCA ACG CTT CCT GGT AAT GAA TCT GGT GGA TGG ATG GCG 1620 Ile Val Phe Glu His Pro Thr Leu Pro Gly Asn Glu Ser Gly Gly Trp MET Ala 514 GCT CGT CTC GAG CCA GGT TTG AAC GGC GTG GTT CGT TCT TAC AAA CCT CCC GTT 162 Ala Arg Leu Glu Pro Gly Leu Asn Gly Val Val Arg Ser Tyr Lys Pro Pro Val 28 AAG GAA CGT CAC CTC GAA AAA TCG GCT GAC GCG CCT CCT AGT CTA AAG AAG TCT 1674 Lys Glu Arg His Leu Glu Lys Ser Ala Asp Ala Pro Pro Ser Leu Lys Lys Ser 532 CCA GGC CGG TCC GAT TCC CCT AAG GCG CAC CAG AAC CAA ACC ACC AAC CAA ACC 216 Pro Gly Arg Ser Asp Ser Pro Lys Ala His Gln Asn Gln Thr Thr Asn Gin Thr 46 GTC TCG ACG CCG TTT ATG AAC ACA ACT GCG AAG ATG TAC TCG ATG TCC GAG GTC 1728 Val Ser Thr Pro Phe Met Asn Thr Thr Ala Lys Met Tyr Ser Met Ser Glu Val 550 GTG TTC TTG AAA CCA GCC AAG GTT CAT GAC GAT GAC GAA GAC GTG TCG AGC GAA 270 Val Phe Leu Lys Pro Ala Lys Val His Asp Asp Asp Glu Asp Val Ser Ser Glu 64 AAG AAG CAT AAT TCG GCT GAC TCT TGC TGG ATC ATT GTC CAT GGA CAT ATC TAT 1782 Lys Lys His Asn Ser Ala Asp Ser Cys Trp Ile Ile Val His Gly His Ile Tyr 568 GAC GAG AAC GAG ACA CAC AAC AGC AAC GCC GTG TAC TAC AAG GAG ATG ATA AGA 324 Asp Glu Asn Glu Thr His Asn Ser Asn Ala Val Tyr Tyr Lys Glu Met Ile Arg 82 GAT TGT ACA CGA TTC CTT ATG GAT CAC CCG GGT GGT TCG GAT TCA ATC TTG ATC 1836 Asp Cys Thr Arg Phe Leu Met Asp His Pro Gly Gly Ser Asp Ser lie Leu Ile 586 AAA TCC AAC GCC GAG CTT GAA CCG TCC GTT TTG GAC CCG AGG GAC GAA TAC ACG 378 Lys Ser Asn Ala Glu Leu Glu Pro Ser Val Leu Asp Pro Arg Asp Glu Tyr Thr 100 AAT GCT GGT ACG GAT TGT ACG GAG GAG TTT GAA GCC ATT CAC TCG GAT AAA GCC 1890 Asn Ala Gly Thr Asp Cys Thr Glu Glu Phe Glu Ala Ile 
His Ser Asp Lys Ala 604 GCT GAT AGC TGG ATC GAG CGT AAC CCT TCC ATG GTA CGT CTC ACA GGG AAA CAT 432 Ala Asp Ser Trp Ile Glu Arg Asn Pro Ser Met Val Arg Leu Thr Gly Lys His 118 AAG AAG ATG CTT GAG GAT TAC CGT ATC GGT GAG CTC ATC ACC ACT GGT TAT TCC 1944 Lys Lys Met Leu Glu Asp Tyr Arg Ile Gly Glu Leu Ile Thr Thr Gly Tyr Ser 622 CCC TTC AAC TCC GAG GCG CCT CTT AAC CGT TTA ATG CAC CAC GGG TTT ATC ACC 486 Pro Phe Asn Ser Glu Ala Pro Leu Asn Arg Leu Met His His Gly Phe Ile Thr 136 TCT GAC TCT TCC TCG CCT AAC AAC TCG GTT CAC GGT TCA TCC GCC GTG TTC TCG 1998 Ser Asp Ser Ser Ser Pro Asn Asn Ser Val His Gly Ser Ser Ala Val Phe Ser 640 CCT GTC CCG TTG CAC TAC GTT CGT AAC CAC GGC CAC GTC CCT AAA GCC CAA TGG 540 Pro Val Pro Leu His Tyr Val Arg Asn His Gly His Val Pro Lys Ala Gln Trp 154 CTG TTG GCT CCC ATT GGA GAG GCG ACT CCG GTT AGG AAC CTC GCT TTG GTT AAT 2052 Leu Leu Ala Pro Ile Gly Glu Ala Thr Pro Val Arg Asn Leu Ala Leu Val Asn 658 GCC GAA TGG ACG GTC GAG GTG ACC GGA TTC GTC AAA CGG CCC ATG AAA TTC ACC 594 Ala Glu Trp Thr Val Glu Val Thr Gly Phe Val Lys Arg Pro Met Lys Phe Thr 172 CCC CGG GCT AAA GTC CCG GTT CAA CTC GTC GAA AAG ACT TCC ATT TCT CAT GAT 2106 Pro Arg Ala Lys Val Pro Val Gln Leu Val Glu Lys Thr Ser Ile Ser His Asp 676 ATG GAC CAG CTC GTC TCC GAG TTT GCT TAC CGC GAG TTC GCC GCG ACG CTA GTC 648 Met Asp Gin Leu Val Ser Glu Phe Ala Tyr Arg Glu Phe Ala Ala Thr Leu Val 190 GTT CGT AAA TTC CGG TTT GCT TTA CCG GTT GAG GAT ATG GTT CTA GGC TTA CCG 2160 Val Arg Lys Phe Arg Phe Ala Leu Pro Val Glu Asp Met Val Leu Gly Leu Pro 694 TGC GCG GGG AAC CGC CGT AAG GAA CAG AAC ATG GTG AAG AAG TCA AAG GGA TTC 702 Cys Ala Gly Asn Arg Arg Lys Glu Gln Asn Met Val Lys Lys Ser Lys Gly Phe 208 GTT GGT AAG CAC ATT TTC CTT TGC GCC ACC ATC AAT GAC AAG CTC TGC CTC AGA 2214 Val Gly Lys His Ile Phe Leu Cys Ala Thr Ile Asn Asp Lys Leu Cys Leu Arg 712 AAC TGG GGA TCC GCC GGA GTT TCC ACC TCC GTG TGG CGT GGT GTC CCT CTC TGC 756 Asn Trp Gly Ser Ala Gly Val Ser Thr Ser Val Trp Arg Gly Val Pro Leu 
Cys 226 GCT TAC ACA CCA AGC AGC ACC GTT GAT GTG GTT GGC TAC TTC GAG CTC GTG GTC 2268 Ala Tyr Thr Pro Ser Ser Thr Val Asp Val Val Gly Tyr Phe Glu Leu Val Val 730 GAC GTA CTG CGT CGC TGC GGG ATC TTT AGC CGA AAA GGC GGC GCT CTC AAC GTC 810 Asp Val Leu Arg Arg Cys Gly Ile Phe Ser Arg Lys Gly Gly Ala Leu Asn Val 244 AAG ATT TAC TTT GGC GGT GTC CAC CCA AGA TTC CCT AAC GGC GGG CTC ATG TCT 2322 Lys Ile Tyr Phe Gly Gly Val His Pro Arg Phe Pro Asn Gly Gly Leu Met Ser 748 TGC TTC GAA GGG TCG GAG GAT CTT CCG GGC GGT GCC GGA ACT GCT GGT TCC AAA 684 Cys Phe Glu Gly Ser Glu Asp Leu Pro Gly Gly Ala Gly Thr Ala Gly Ser Lys 262 CAG TAC CTA GAC TCT TTG CCT ATA GGG TCA ACT TTG GAG ATT AAA GGA CCA TTG 2376 Gin Tyr Leu Asp Ser Leu Pro Ile Gly Ser Thr Leu Glu Ile Lys Gly Pro Leu 766 TAC GGA ACG AGC ATC AAG AAG GAA TAT GCC ATG GAT CCA TCA AGA GAC ATC ATT 918 Tyr Gly Thr Ser Ile Lys Lys Glu Tyr Ala Met Asp Pro Ser Arg Asp Ile Ile 280 GGT CAC GTT GAG TAT CTC GGC AAG GGT AGT TTC ACG GTT CAC GGT AAA CCA AAG 2430 Gly His Val Glu Tyr Leu Gly Lys Gly Ser Phe Thr Val His Gly Lys Pro Lys 784 TTG GCT TAT ATG CAA AAC GGA GAG TAT CTA ACA CCA GAC CAC GGT TTT CCG GTT 972 Leu Ala Tyr Met Gln Asn Gly Glu Tyr Leu Thr Pro Asp His Gly Phe Pro Val 298 TTT GCT GAT AAA TTG GCA ATG TTG GCA GGT GGA ACC GGA ATA ACT CCG GTT TAC 2484 Phe Ala Asp Lys Leu Ala Met Leu Ala Gly Gly Thr Gly Ile Thr Pro Val Tyr 802 CGG ATC ATC ATC CCC GGT TTC ATT GGT GGC CGG ATG GTT AAA TGG TTG AAA CGA 1026 Arg Ile Ile Ile Pro Gly Phe Ile Gly Gly Arg Met Val Lys Trp Leu Lys Arg 316 CAA ATT ATC CAA GCC ATT CTC AAG GAT CCA GAG GAT GAG ACT GAA ATG TAC GTC 2538 Gln Ile Ile Gln Ala Ile Leu Lys Asp Pro Glu Asp Glu Thr Glu Met Tyr Val 820 ATC ATT GTC ACA ACT AAA GAA TCC GAC AAT TTC TAC CAT TTC AAG GAC AAC AGA 1080 Ile Ile Val Thr Thr Lys Glu Ser Asp Asn Phe Tyr His Phe Lys Asp Asn Arg 334 ATT TAT GCT AAC CGG ACC GAG GAA GAT ATT CTC CTA AGG GAG GAA CTG GAT GGT 2592 Ile Tyr Ala Asn Arg Thr Glu Glu Asp Ile Leu Leu Arg Glu Glu Leu Asp Gly 838 GTT 
TTA CCT TCT TTG GTA GAC GCC GAA CTC GCC GAC GAA GAA GGT TGG TGG TAT 1134 Val Leu Pro Ser Leu Val Asp Ala Glu Leu Ala Asp Glu Glu Gly Trp Trp Tyr 352 TGG GCA GAG CAA TAC CCG GAC CGG TTA AAG GTT TGG TAC GTA GTG GAA TCA GCT 2646 Trp Ala Glu Gin Tyr Pro Asp Arg Leu Lys Val Trp Tyr Val Val Glu Ser Ala 856 AAG CCA GAG TAC ATA ATC AAC GAG CTA AAC ATA AAC TCC GTG ATT ACG ACG CCA 1188 Lys Pro Glu Tyr Ile Ile Asn Glu Leu Asn Ile Asn Ser Val Ile Thr Thr Pro 370 AAG GAA GGT TGG GCA TAC AGT ACC GGG TTT ATT TCC GAG GCG ATT ATG CGA GAA 2700 Lys Glu Gly Trp Ala Tyr Ser Thr Gly Phe Ile Ser Glu Ala Ile Met Arg Glu 874 TGT CAC GAG GAG ATT CTT CCC ATC AAC GCT TTC ACA ACC CAA AGA CCT TAT ACT 1242 Cys His Glu Glu Ile Leu Pro Ile Asn Ala Phe Thr Thr Gln Arg Pro Tyr Thr 388 CAT ATC CCT GAT GGA TTA GAT GGC TCA GCC CTT GCC ATG GCT TGC GGA CCA CCA 2754 His Ile Pro Asp Gly Leu Asp Gly Ser Ala Leu Ala Met Ala Cys Gly Pro Pro 892 TTA AAG GGT TAC GCA TAT TCC GGA GGT GGA AAA AAA GTG ACC CGT GTG GAG GTC 1296 Leu Lys Gly Tyr Ala Tyr Ser Gly Gly Gly Lys Lys Val Thr Arg Val Glu Val 406 CCG ATG ATT CAG TTT GCG GTT CAG CCG AAT TTG GAG AAG ATG CAA TAT AAC ATC 2808 Pro MET Ile Gln Phe Ala Val GIn Pro Asn Leu Glu Lys Met Gln Tyr Asn Ile 910 ACG GTA GAT GGT GGA GAG ACA TGG AAC GTA TGT GCA CTT GAC CAT CAA GAG AAG 1350 Thr Val Asp Gly Gly Glu Thr Trp Asn Val Cys Ala Leu Asp His Gln Glu Lys 424 AAG GAG GAT TTC TTG ATA TTC TAG TCT AAA GCC AAG ATA TTT CAA AGT CAA AAC 2862 Lys Glu Asp Phe Leu Ile Phe CCA AAC AAG TAT GGG AAG TTC TGG TGT TGG TGT TTT TGG TCA CTT GAG GTT GAG 1404 Pro Asn Lys Tyr Gly Lys Phe Trp Cys Trp Cys Phe Trp Ser Leu Glu Val Glu 442 GTT AAG TCG TCA AAA AGC CTA GTG TTA TGA TAG ATT CTA AAT AAG TAT TGA GGG 2916 GTT TTG GAC TTG CTT AGT GCC AAA GAG ATT GCT GTT CGT GCA TGG GAC GAG ACT 1458 ATT TGT TTG TAT ATA ATG TTG GTT CTT TAA AGT CTT GGA TTT GGA AAT ATA ATG 2970 Val Leu Asp Leu Leu Ser Ala Lys Glu Ile Ala Val Arg Ala Trp Asp Glu Thr 460 TGT TCG TTG TAT TCA CGA TCG ATA TTT TTT TTC ACA ATA ATA TAC 
AAA AAG TAA 3024
CTC AAC ACG CAG CCC GAG AAA ATG ATA TGG AAT CTC ATG GGG ATG ATG AAT AAC 1512
Leu Asn Thr Gln Pro Glu Lys MET Ile Trp Asn Leu Met Gly Met Met Asn Asn 478
TTT CAG A

FIG. 2. Sequence of the Arabidopsis nitrate reductase cDNA. The cDNA was sequenced in both directions by the chain-termination method. The protein sequence was determined by the TRANSLATE function of the PEP program of BIONET. The last nucleotide preceded the poly(A) tract in the cDNA.

cyte cytochrome b5 (18, 22-24); yeast flavocytochrome b2 (19); and sulfite oxidase (11, 20). Only three representative sequences are shown for each class (Fig. 3). These proteins are ≈50% identical to nitrate reductase. The second group of proteins is similar to residues 640-917 (Fig. 4). This group contains the microsomal and erythrocyte NADH-cytochrome b5 reductases, which bind FAD (25, 26). Only the bovine microsomal enzyme is shown (Fig. 4). Approximately 45% identity was found. For both groups of proteins, the similarity increases substantially when conservative amino acid changes are incorporated into the comparison. No other similar proteins were found.

From this analysis, we could account for only half of the nitrate reductase protein. Heme- and FAD-binding regions could be assigned, but no sequence corresponding to a molybdenum-pterin-binding domain was found. Therefore, the sequences from proteins known to bind molybdenum were compared directly with nitrate reductase. Of the three proteins examined (xanthine reductase, biotin sulfoxide reductase, and sulfite oxidase), only one, sulfite oxidase, revealed a significant match. Two short peptide sequences were generated from the molybdenum-binding domain of rat liver sulfite oxidase. Sulfite oxidase has been characterized from rat and chicken (11, 20). It has two domains, one that binds heme and one that binds molybdenum. The two domains have been separated by tryptic cleavage (10). The sequence of the heme domain is shown in Fig. 3. The sequences of two oligopeptides from the molybdenum domain were determined and found to be similar to the nitrate reductase sequence (Fig. 5).

(A)
ATNR  (542-619): AKMYSMSEVKKHNSADSCWIIVHGHIYDCTRFLMDHPGGSDSILINAGTDCTEEFEAI-HSDKAKKMLEDYRIGELITT
CBCH5 (1-79):    GRYYRLEEVWKHNESDSTWIIVHHRIYDITKFLDEHPGGEEVLREQAGGDATEDFEDVGHSTDARALSETFIIGELHPD

(B)
ATNR   (534-604): STPFMNTTAKMYSMSEVKKHNSADSCWIIVHGHIYDCTRFLMDHPGGSDSILINAGTDCTEEFEAIHSDKA
A24583 (80-150):  NEPKLDMNKQKISPAEVAKHNKPDDCWVVINGYVYDLTRFLPNHPGGQDVIKFNAGKDVTAIFEPLHAPNV

(C)
ATNR (542-619): AKMYSMSEVKKHNSADSCWIIVHG-HIYDCTRFLMDHPGGSDSILINAGTDCTEEFEAIHSDKAKKMLE---DYRIGELITT
RLSO:           YPRYTREEVGRHRSPEERVWVTHGTDVFDVTDFVELHPGGPDKILLAAGGALEPFWALYAVHGEPHVLELLQQYKVGELSPD

FIG. 3. Sequence comparison of the heme-binding domain. The amino acid sequences of chicken microsomal cytochrome b5 (CBCH5, ref. 18) (A), yeast flavocytochrome b2 (A24583, ref. 19) (B), and chicken liver sulfite oxidase (CLSO, ref. 20) (C) were aligned with the A. thaliana nitrate reductase sequence (ATNR) to maximize similarities. Identical amino acid positions are indicated. Standard one-letter amino acid symbols are used here and in Figs. 4 and 5.
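The percent-identity figures quoted for these comparisons can be checked directly from an alignment. The following is a minimal sketch, assuming a pairwise alignment given as two equal-length strings with `-` marking gaps (as in Fig. 3A) and counting identity only over columns where neither sequence has a gap. The function name `percent_identity` and the two constants are hypothetical; the original comparisons were made with the BIONET programs, not with this code.

```python
# Aligned sequences copied from Fig. 3A ("-" marks an alignment gap).
ATNR_542_619 = (
    "AKMYSMSEVKKHNSADSCWIIVHGHIYDCTRFLMDHPGGSDSILINAGTDCTEEFEAI-"
    "HSDKAKKMLEDYRIGELITT"
)
CBCH5_1_79 = (
    "GRYYRLEEVWKHNESDSTWIIVHHRIYDITKFLDEHPGGEEVLREQAGGDATEDFEDVG"
    "HSTDARALSETFIIGELHPD"
)


def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of ungapped alignment columns in which both residues match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(seq_a, seq_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns that contain a gap in either sequence
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    value = percent_identity(ATNR_542_619, CBCH5_1_79)
    print(f"ATNR vs CBCH5 identity: {value:.1f}%")
```

For the Fig. 3A pair this comes out close to the ≈50% identity stated in the text. Excluding gapped columns from the denominator is one common convention; dividing by the full alignment length instead would give a slightly lower value.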

ATNR   (638-732): VFSLLAPIGEATPVRNLALVNPRAKVPVQLVEKTSISHDVRKFRFALPVEDMVLGLPVGKHIFLCATINDKLCLRAYTPSSTVDVVGYFELVVKI
A23896 (18-111):  LYSLIMKLFQ-RSTPAITLENPDIKYPLRLIDKEVISHDTRRFRFALPSPEHILGLPVGQHIYLSARIDGNLVIRPYTPVSSDDDKGFVDLVIKV

ATNR   (733-822): YFGGVHPRFPNGGLMSQYLDSLPIGSTLEIKGPLGHVEYLGKGSFTVY----GKPKF--ADKLAMLAGGTGITPVYQIIQAILKDPEDETEMYVIY
A23896 (112-207): YFKDTHPKFPAGGKMSQYLESMGIGDTIEFRGPNGLLVYQGKGKFAIRPDSKSDPVIKTVKSVGMIAGGTGITPMLQVIRAIMKDPDDHTVCHLLF

ATNR   (823-917): ANRTEEDILLREELDGWAIQYPDRLKVWYVVESAKEGWAYSTGFISEAIMREHIPDGLDGSALAIMACGPPPMIQFAVQPNLEKMQYNIKEDFLIF
A23896 (208-300): ANQTEKDILLRPELEELRDEHSARFKLWYTVDKAPEAWDYSQGFVNEEMIRDHLPPP-EEEPLVLMCGPPPMIQYACLPNLDRVG-HPKERCFAF
Marked positions: ATNR 889; A23896 273 and 283.

FIG. 4. Sequence comparison of the flavin-binding domain. The amino acid sequence of bovine NADH-cytochrome b5 reductase (A23896, ref. 25) and the A. thaliana nitrate reductase sequence (ATNR) were aligned to maximize similarities. Identical amino acid positions are indicated.

DISCUSSION

We have isolated a cDNA encoding nitrate reductase from Arabidopsis. The identity of the cDNA was established by RNA analysis and sequence comparison. The cDNA hybridizes to a 3.2-kb mRNA that is induced by nitrate. Identical properties have been described for nitrate reductase mRNA from squash (1) and barley (2). The predicted protein sequence is >80% identical to the tobacco nitrate reductase sequence within a 450 amino acid region (3). It is very likely that this similarity will extend over the entire protein once the complete tobacco sequence is available.

It has been proposed that plant nitrate reductases have three domains, one for each redox center (5). Three prosthetic groups correspond to each redox center: FAD, heme, and molybdenum-pterin. By comparing the nitrate reductase sequence to the sequences of other proteins that bind the same prosthetic groups and have similar properties, we have been able to assign a functional domain to each of three regions of the protein (Fig. 6). A molybdenum-binding domain is assigned to the N-terminal half of the protein, the heme-binding domain to the central region of the sequence, and the FAD-binding domain to the C-terminal third of the protein. The precise boundaries of the domains, if they exist, cannot be determined by this analysis.

(A)
ATNR (112-126): VRL-TGKHPFNSEAPL
RLSO:           LRINSQR-PFNAEPPP

(B)
ATNR (272-300): AMDPSRDIILAYMQNGEYLTPDHGFPVRI
RLSO:           AMDPQAEVLLAYEMNGQPLPRDSGFPVRV

FIG. 5. Sequence comparisons of the molybdenum-binding domain. The amino acid sequences of two peptides of rat liver sulfite oxidase (RLSO) were aligned with the A. thaliana nitrate reductase sequence (ATNR) to maximize similarities. Identical amino acid positions are indicated.

The proteins uncovered in the homology search all belong to the cytochrome b5 superfamily of proteins and their associated reductases (21). These proteins have a cytochrome b5 fold (21) and reduce or oxidize cytochrome c. This family includes microsomal, erythrocyte, and mitochondrial cytochrome b5; sulfite oxidase; yeast flavocytochrome b2 (lactate dehydrogenase); and Neurospora and tobacco nitrate reductase (3, 21, 27). The associated reductases include the FAD-binding NADH-cytochrome b5 reductases. The FMN-binding domain of flavocytochrome b2 and the molybdenum-binding domain of sulfite oxidase can also be considered as associated reductases of the cytochrome b5 domain. It has been proposed that an ancestral gene for cytochrome b5 was duplicated and shuffled together with other reductase-encoding genes during evolution (21). The Arabidopsis nitrate reductase protein belongs to this family, having a cytochrome b5-like domain. In addition, this protein has two associated reductases fused with the cytochrome b5 domain: an FAD-binding domain similar to NADH-cytochrome b5 reductase and a molybdenum-binding domain similar to sulfite oxidase. It will be interesting to see whether these domains correspond to independent exons of the gene and whether they can be shuffled in vitro to construct proteins with different specificities and reactivities.

Polypeptide: H2N-Met (1) to Phe-CO2H (917). Positions marked on the line: 1, 110, 125, 270, 300, 540, 620, 640, 917.
Above the line (proposed domain, prosthetic group): Molybdoreductase, Molybdopterin; Cytochrome b, Heme; Flavoreductase, FAD.
Beneath the line (similar proteins): sulfite oxidase MoCo; cyt b5 and cyt b2; cytochrome b5 reductase.

FIG. 6. Schematic diagram of the domains of nitrate reductase. The polypeptide chain corresponding to one subunit of the native protein is represented. Similarities to other proteins are indicated beneath the line. Proposed regions for the different domains and their corresponding prosthetic groups are indicated above the line.

The nitrate reductase sequence also provides information on another important aspect of the enzyme: the identity of a critical cysteine residue. A single cysteine residue has been found to react with N-ethylmaleimide in both nitrate reductase from Chlorella (28) and microsomal NADH-cytochrome b5 reductase (29). This modification blocks NADH binding and dehydrogenase activity. The enzyme can be protected from modification by prior incubation with NADH. This reactive cysteine was localized to residue 283 of cytochrome b5 reductase (29). When the Arabidopsis nitrate reductase sequence is compared to the cytochrome b5 reductase sequence, one finds that this residue is not conserved; the Arabidopsis sequence has a valine at the corresponding position (Fig. 4). However, another cysteine located nearby (residue 273) is conserved in the Arabidopsis sequence (Fig. 4). This conservation suggests that Cys-273 of cytochrome b5 reductase and Cys-889 of nitrate reductase are critical residues. The role of the cysteine residues is not known, but it has been proposed that they are involved in NADH binding (20) or electron transfer (30). It will be interesting to focus structure-reactivity studies on the conserved cysteine to determine how it might participate in the redox reactions.

We wish to thank K. V. Rajagopalan, Wilbur Campbell, Alan Sachs, Allan Campbell, Peg Riley, Peter Bilous, Mike Innes, Jim Kinghorn, Ian Johnston, Michele Caboche, and Steven Rothstein for helpful data and discussion. We thank Will Burkhart and Mary Moyer for technical help in sequencing peptides. This work was supported by National Science Foundation Research Grant 1DBC 8402390 and U.S. Department of Energy Research Grant DE-FG03-84ER13265. N.M.C. was supported by a National Science Foundation Postdoctoral Research Fellowship in Plant Biology.

1. Crawford, N., Campbell, W. & Davis, R. (1986) Proc. Natl. Acad. Sci. USA 83, 8073-8076.
2. Cheng, C., Dewdney, J., Kleinhofs, A. & Goodman, H. (1986) Proc. Natl. Acad. Sci. USA 83, 6825-6828.
3. Calza, R., Huttner, E., Vincentz, M., Rouze, P., Galanyan, F., Vancheret, H., Cherel, I., Meyer, C., Kronenberger, J. & Caboche, M. (1987) Mol. Gen. Genet. 209, 552-562.
4. Back, E., Burkhart, W., Moyer, M., Privalle, L. & Rothstein, S. (1988) Mol. Gen. Genet. 212, 20-26.
5. Campbell, W. (1986) in Biochemical Basis of Plant Breeding, ed. Neyra, C. (CRC, Boca Raton, FL), Vol. 2, pp. 1-39.
6. Solomonson, L., Barber, M., Robbins, A. & Oaks, A. (1986) J. Biol. Chem. 261, 11290-11294.
7. Meyerowitz, E. & Pruitt, R. (1985) Science 229, 1214-1218.
8. Braaksma, F. & Feenstra, W. (1982) Theor. Appl. Genet. 64, 83-90.
9. Braaksma, F. & Feenstra, W. (1982) Physiol. Plant 54, 351-360.
10. Johnson, J. & Rajagopalan, K. (1977) J. Biol. Chem. 252, 2017-2025.
11. Rajagopalan, K. V. (1980) in Molybdenum and Molybdenum-Containing Enzymes, ed. Coughlan, M. (Pergamon, New York), pp. 241-272.
12. Huynh, T., Young, R. & Davis, R. (1985) in DNA Cloning, ed. Glover, D. (IRL, Arlington, VA), Vol. 1, pp. 49-78.
13. Davis, R., Botstein, D. & Roth, J. (1980) Advanced Bacterial Genetics (Cold Spring Harbor Lab., Cold Spring Harbor, NY), p. 174.
14. Anderson, S. (1981) Nucleic Acids Res. 9, 3015-3027.
15. Sanger, F., Nicklen, S. & Coulson, A. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467.
16. Kleinhofs, A., Warner, R. & Narayanam, K. (1985) in Oxford Surveys of Plant Molecular Biology, ed. Miflin, B. (Oxford Univ. Press, New York), Vol. 2, pp. 91-122.
17. Crawford, N. & Davis, R. (1988) in Current Topics in Plant Biochemistry and Physiology, ed. Randall, D. (Univ. Missouri, Columbia, MO), Vol. 7, in press.
18. Nobrega, F. & Ozols, J. (1971) J. Biol. Chem. 246, 1706-1717.
19. Lederer, F., Cortial, S., Becam, A., Haumont, P. & Perez, L. (1985) Eur. J. Biochem. 152, 419-428.
20. Guiard, B. & Lederer, F. (1979) Eur. J. Biochem. 100, 441-453.
21. Guiard, B. & Lederer, F. (1979) J. Mol. Biol. 135, 639-650.
22. Lederer, F., Ghrir, R., Guiard, B., Cortial, S. & Ito, A. (1983) Eur. J. Biochem. 132, 95-102.
23. Ozols, J., Gerard, C. & Nobrega, F. (1976) J. Biol. Chem. 251, 6767-6774.
24. Tsugita, A., Kobayashi, M., Tani, S., Kyo, S., Rashid, M., Yoshida, Y., Kajihara, T. & Hagihara, B. (1970) Proc. Natl. Acad. Sci. USA 67, 442-447.
25. Ozols, J., Korza, G., Heinemann, F., Hediger, M. & Strittmatter, P. (1985) J. Biol. Chem. 260, 11953-11961.
26. Yabisui, T., Miyata, T., Iwanaga, S., Tamura, M., Yoshida, S., Takeshita, M. & Nakajima, H. (1984) J. Biochem. (Tokyo) 96, 579-582.
27. Le, K. & Lederer, F. (1983) EMBO J. 2, 1909-1914.
28. Barber, M. & Solomonson, L. (1986) J. Biol. Chem. 261, 4562-4567.
29. Hackett, C., Novoa, W., Ozols, J. & Strittmatter, P. (1986) J. Biol. Chem. 261, 9854-9857.
30. Adams, M. & Mortenson, L. (1985) in Molybdenum Enzymes, ed. Sporo, T. (Wiley, New York), Vol. 7, pp. 519-594.