Proc. Natl. Acad. Sci. USA
Vol. 90, pp. 6829-6833, July 1993
Plant Biology

A tobacco gene family for flower cell wall proteins with a proline-rich domain and a cysteine-rich domain

HEN-MING WU, JITAO ZOU, BRUCE MAY, QING GU, AND ALICE Y. CHEUNG*
Department of Biology, Yale University, P.O. Box 6666, New Haven, CT 06511
Communicated by J. E. Varner, March 29, 1993

ABSTRACT Flowering is known to be associated with the induction of many cell wall proteins. We report here five members of a tobacco gene family (CELP, Cys-rich extensin-like protein) whose mRNAs are found predominantly in flowers and encode extensin-like Pro-rich proteins. CELP mRNAs accumulate most abundantly in vascular and epidermal tissues of floral organs. In the pistil, CELP mRNAs also accumulate in a thin layer of cells between the transmitting tissue and the cortex of the style and in a surface layer of cells of the placenta in the ovary. This unique accumulation pattern of CELP mRNAs in the pistil suggests a possible role in pollination and fertilization processes. CELP genes encode a class of plant extracellular matrix proteins that have several distinct structural features: a Pro-rich extensin-like domain with Xaa-Pro3-7 motifs and Xaa-Pro doublets, a Cys-rich region, and a highly charged C terminus. The extensin-like domains in these proteins differ significantly in their length and these differences appear to be results of both long and short deletions within the coding regions of their genes. Furthermore, the number of charged residues in the C-terminal region varies among the CELPs. These structural differences may contribute to functional versatility in the CELPs. On the other hand, the Cys-rich domain is highly conserved among CELPs and the positions of the Cys residues are conserved, suggesting that this region may have a common functional role. The presence of a Pro-rich domain and a Cys-rich domain in these CELPs is reminiscent of a class of hydroxyproline-rich glycoproteins, solanaceous lectins, that are believed to be important in cell-cell recognition. The structure of these CELPs indicates that they may be multifunctional and that their genes may have arisen from recombinational events.

The primary walls of plant cells are thin and flexible but strong and can accommodate cell expansion and changes in cell shape as cells grow and differentiate. Besides providing structural integrity, cell walls are also the sites where cell-cell interactions as well as interactions between plants and their environment take place (1-5). Cell-wall proteins are diverse and include many proteins that are believed to play structural roles and others that are involved in defense or other cellular and biochemical processes. The best characterized cell wall protein genes are those for several classes of hydroxyproline (Hyp)-rich glycoproteins (HRGPs) (2, 6). One class includes the extensins, which typically have numerous Ser-Pro(Hyp)4 motifs throughout the entire protein. Another class encodes what are collectively known as Pro (Hyp)-rich proteins, which have multiple copies of the pentapeptide Val-Tyr-Lys-Pro-Pro or its variants. There are at least two additional classes of glycosylated Pro (Hyp)-rich proteins, arabinogalactan proteins and solanaceous lectins. Gly-rich cell wall protein genes have also been reported (7). Differential expression of cell wall protein genes may contribute to meeting the functional and physical demands on different cell types and at different developmental stages (8-10). A flower is morphologically complex and composed of multiple tissue types (11). Flowering is known to stimulate or induce the expression of several cell-wall protein genes (12-18). These extracellular matrix proteins may be involved in the organization of the floral apex from a vegetative apex or may fulfill special structural and functional requirements for flower cell walls. Some flowering-associated extracellular matrix proteins are also stress-inducible or pathogen-related and may contribute to the overall defense system in flowers (12, 19, 20).

We report here the characterization of a family of tobacco genes and cDNAs (CELPs, Cys-rich extensin-like proteins) for a set of proteins that have an extensin-like domain.† CELP mRNAs accumulate almost exclusively in the four floral organs: sepal, petal, stamen, and pistil. CELP mRNAs accumulate in multiple cell types in each floral organ. They are abundant in the vascular bundles and the epidermis, except in the pistil where they are most abundant in the cells demarcating the transmitting tissue and the cortex and in the cell layer lining the placenta to which ovules are attached. The CELPs have three distinct structural domains: a Pro-rich extensin-like domain, a Cys-rich domain, and a highly charged C terminus. These features make the CELPs structurally distinct from tobacco extensin (21). The possible functional roles for this class of extracellular matrix proteins will be discussed in light of the available structural and expression information.

MATERIALS AND METHODS

cDNA and Genomic Library Construction and Screening. CELP cDNA clones were isolated from a tobacco floral cDNA library as described (20). A λ Dash II genomic N. tabacum DNA library was made as described (22). CELP genomic clones were isolated from an unamplified library using a 32P-labeled CELP-1c probe prepared by random priming (22). Hybridization and washes were carried out at 68°C in buffers as described (15).

Nucleotide Sequence Analysis. Nucleotide sequence analysis was carried out by the dideoxynucleotide sequencing method using double-stranded DNA and Sequenase Version 2 (United States Biochemical) according to the manufacturer's recommendations. All the nucleotide sequences were determined on both strands of DNA.

RNA Expression Analysis. RNA preparation, gel electrophoresis, blot analysis, and in situ hybridizations have been described (20).

Abbreviations: CELP, Cys-rich extensin-like protein; Hyp, hydroxyproline; HRGP, Hyp-rich glycoprotein.
*To whom reprint requests should be addressed.
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.
†The sequences reported in this paper have been deposited in the GenBank data base (accession nos. CELP-1 to CELP-5, L13439-L13443, respectively).

RESULTS

CELP-1 and CELP-1c, a Gene and Its cDNA for an Extensin-Like Pro-Rich Protein. CELP-1c, a cDNA derived from a class of highly expressed flower mRNAs, was isolated from a tobacco floral cDNA library. CELP-1c has an open reading frame corresponding to a 209-aa Pro-rich protein (Fig. 1). The first of four closely spaced Met residues (NT 37-39) (Fig. 1) is assumed to function as the initiation codon of the deduced CELP-1 protein since the 5' end of the mRNA is located at the predominant transcription initiation site of the CELP-1 gene at the thymidine denoted NT 1 (data not shown).

The most striking feature of the deduced CELP-1 protein is in the 74-aa central region that contains a series of Pro-rich sequences. The Pro residues are distributed in seven Xaa-Pro3-5 motifs (Xaa being Trp, Cys, or Ser) and 18 Xaa-Pro doublets (Xaa being Trp, Phe, Cys, Arg, or Gln). Overall, Pro residues are 26.3% of the entire CELP-1 protein but they make up 66.2% of the Pro-rich domain (see Table 1). The Xaa-Pro3-5 motifs in the predicted CELP are similar to the Ser-Pro4 repeats found in many plant-cell-wall HRGPs (4, 6). CELP-1, especially its C-terminal half, is relatively Cys-rich (Fig. 1 and Table 1). Moreover, 9 of the last 11 aa are charged. Characterization of the CELP-1 gene revealed a 400-bp intron that interrupts Gln196 and Val197 (data not shown), thus separating the region encoding the highly charged C terminus of the CELP-1 from the rest of the gene.

FIG. 1. Nucleotide sequence of the CELP-1 cDNA and the deduced amino acid sequence of the CELP-1. Arrow, predominant transcription initiation site of the CELP-1 mRNA; triangle, putative signal peptide processing site; diamond, location of the intron in CELP-1 precursor mRNA. Initiation and termination codons are in italic type. The Pro-rich domain is underlined. The charged amino acid residues at the C terminus are indicated by asterisks. The putative polyadenylylation signal is overlined. The numbers on the left and right indicate first NT positions and the last amino acid positions, respectively, of each row.

[Fig. 1 sequence panel. The coding-region nucleotides are not legible in this copy; the untranslated regions and the deduced amino acid sequence are given as they survive.]

    5' untranslated region (NT 1-36):
      tgt ata aca caa acc att cag cga aaa acg gca acc

    Deduced amino acid sequence:
      MNNMLIML                   8
      MVAAFLFCSHQQVATAREVV      28
      VAELAVADDRNELQLLWPWE      48
      IPCYLTWPFPWPPPPPWPCP      68
      PPRPRPRPRPCPSPPPPPRP      88
      RPCPSPPPPPRPRPCPSPPP     108
      PPQPRPRPSPPPPSPPPPAP     128
      SSSCSASDESNIYRCMFNET     148
      KIDPCCPTFKSILGTSCPCY     168
      KYAENLDNQVLITIESYCDV     188
      DSPCKGVQVIKLSKEEEKKK     208
      K                        209

    Termination codon and 3' untranslated region:
           taa aaa taa agt ttt aat gtt taa ttc cag atc att tat tag tag atg att tgc tat
      721  taa tta tcc ttt atg ttg aaa gtc tct agg tta tgt ttt tgg tct tcc ttg ttg tgt ttc
      781  aga aat aat ttg cca tat atg tca aaa tct tag tac taa taa taa gaa tat tat aat ata
      841  taa aat ctc tca tct agaaaaa

Table 1. Amino acid composition of CELP-1, -2, -3, -4, and -5, tobacco extensin, and potato lectin

                           % of total residues
    Residue    CELP-1  CELP-2  CELP-3  CELP-4  CELP-5    TE*     PL†
    Ala         4.78    5.1     6.66    5.69    7.45     1.88    3.70
    Arg         6.22    5.61    3.63    3.79    2.48     0       0.82
    Asn         2.87    2.55    3.63    2.53    2.48     0.31    5.34
    Asp         3.82    4.08    4.24    3.16    3.72     0
    Cys‡        6.70    6.63    7.87    8.22    8.07     0      11.5
      (C-terminal domain)  9.87  [values for CELP-2 to CELP-5 illegible]
    Gln         2.87    3.57    2.42    3.79    3.72     0.31    6.99
    Glu         5.26    5.61    6.06    5.69    5.59     0.31
    Gly         0.95    1.02    1.81    1.26    2.48     0.31   11.5
    His         0.478   0.51    0       0.63    0        4.71    0
    Ile         3.34    3.06    5.45    3.79    4.96     0       1.23
    Leu         5.74    5.61    7.87    8.86    9.31     2.20    1.23
    Lys         5.26    5.10    4.84    6.32    4.96    12.26    3.70
    Met         2.39    3.57    1.81    1.26    2.48     0.628   0.41
    Phe         2.39    2.04    4.24    3.16    2.48     0.31    0
    Pro§       26.3    26      26      18.9    19.25    39.93   28.3
      (Pro-rich domain)   [row largely illegible; 67.1 and 65.7 survive, columns uncertain]
    Ser         8.13    7.14    7.27    6.96    6.83    10.06   11.1
    Thr         2.87    3.06    3.63    4.43    3.10     5.03    6.58
    Trp         2.39    2.55    0.60    1.89    2.48     0       3.29
    Tyr         2.39    2.04    4.24    3.16    3.72    14.46    3.29
    Val         4.78    5.10    3.6     6.32    4.34     7.23    0.41
    Total     209¶    196     166¶    159¶    161      318¶    243‖

TE, tobacco extensin; PL, potato lectin.
*Ref. 21.
†Ref. 23.
‡Underlined numbers represent percent of Cys residues in the C-terminal domain.
§Underlined numbers represent percent of Pro residues in the Pro-rich domain.
¶Total number of amino acid residues in unprocessed proteins.
‖Estimated total number.

CELP-1-Related Genes and cDNAs Encode Proteins with Similar Structural Features. CELP-1 gene is a member of a complex multiple gene family (data not shown). Several other CELP-1c-hybridizing cDNAs and genomic clones have also been characterized. CELP-1c, -2c, -3c, and -4c cDNAs and the coding region of CELP-5 gene share significant sequence homology (data not shown). Although the five deduced CELPs vary significantly in length (between 159 and 209 aa) (Fig. 2), they all have the same overall primary structure: a Pro-rich domain, followed by a Cys-rich region and a highly charged C terminus.

Similar to CELP-1c, CELP-2c and -3c appear to be full-length cDNAs whereas CELP-4c does not have an ATG codon at the 5' end of its open reading frame. The N termini of the deduced CELPs are hydrophobic and have features characteristic of signal peptides (Fig. 2) (24). A putative signal peptide processing site is assigned in front of the conserved Gln22 (CELP-1) residue found in all five CELPs. There is one potential N-glycosylation site (25) in CELP-1, -3, -4, and -5 (Fig. 2). Some of the Pro residues in the Pro-rich domain of CELPs are most probably hydroxylated and these Hyp residues may provide additional sites for glycosylation (2, 4, 6).

Sequence analysis of CELP-5 gene revealed that it also has an intron (391 bp) that, similar to the one in CELP-1 gene, interrupts the same Gln-Val dipeptide close to the 3' end of the coding region. Recently, the nucleotide sequences of two partial cDNAs for extensin-like flower-specific genes (pMG02 and pMG04) have been described (13). These partial sequences corresponded to part of the sequences found in CELP-1c and CELP-2c. Whether MG02 and MG04, and CELP-1c and -2c are for the same genes remains to be determined.

FIG. 2. Deduced amino acid sequences of CELP-1, -2, -3, -4, and -5. The first amino acid residues shown are either the first Met codon (in CELP-1c, -2c, -3c, and -5) or the first codon in the 5' end of a cDNA sequence (in CELP-4c). The Pro-rich regions are boxed. Conserved amino acid residues among all five proteins on the N- and C-terminal sides of the Pro-rich domain are underlined. The conserved Cys residues in the C-terminal half of CELPs are indicated by asterisks. The charged amino acid residues at the C termini are in italic type. The potential glycosylation sites in CELP-1, -3, -4, and -5 proteins are indicated by dots. Gaps (dashes) are introduced to allow maximum alignment of these proteins. The numbers on the right indicate amino acid position for each of the CELPs. Triangle, putative signal peptide processing site; diamond, position of intron in CELP-1 and CELP-5 genes.

[Fig. 2 alignment panel is not legible in this copy.]

The Pro-Rich Domains of the CELPs Show Significant Length and Motif Pattern Variations. The Pro-rich domains in the CELPs are flanked by a conserved tetrapeptide (Trp-Pro-Phe-Pro) and an Ala-Pro doublet at the N and C termini, respectively (Fig. 2). The Pro content in these proteins is between 19 and 26% for the total protein and between 65 and 71% for the Pro-rich domains (Table 1). The Xaa-Pro3-7 motifs in CELPs differ from the Ser-Pro4 motifs in extensins in that Xaa may be Ser, Cys, or Trp. Despite the significant homology, CELPs differ among themselves in the number and in the characteristics of the Xaa-Pro3-7 motifs and the Xaa-Pro doublets (Fig. 2). The Pro-rich domains in these proteins range from 35 to 74 aa. Most of the differences appear to result from deletions of relatively long regions of DNA. Some minor insertions and deletions might be responsible for the shorter range heterogeneity. These differences adequately account for the overall length variations among the CELPs.

Striking homology exists in a pentapeptide and an octapeptide flanking the Pro-rich regions of the CELPs (Fig. 2). Interestingly, a highly variable 6-aa domain is located on the C-terminal side of the conserved octapeptides after the Pro-rich regions (Figs. 2 and 3). This domain was 100% conserved between CELP-1 and -3 whereas an entirely different hexapeptide was found in CELP-2 and -4. The hexapeptide in CELP-5 has 3 aa that are identical to those in CELP-1 and -3 and 3 aa that are identical to those in CELP-2 and -4. From the nucleotide sequences encoding these hexapeptides (Fig. 3), it could be argued that the hexapeptide in CELP-5 was derived from those in CELP-1 and -3 via three single-base-pair changes that resulted in changes in 3 aa. The nucleotide sequences in CELP-5 could then have undergone additional changes in 3 bp to yield a total of the 6 aa substitutions seen in CELP-2 and CELP-4.

FIG. 3. Nucleotide sequences encoding the highly variable region C-terminal to the Pro-rich domains of CELPs. Base-pair changes are shown in lowercase type. Amino acid substitutions are shown in boldface and outlined types. Numbers at the end of each line indicate the position of the last amino acid residue shown for each protein.

    CELP-1   E S N I Y R                142
    CELP-3   E S N I Y R                101
    CELP-5   GAA gCA AAg ATT TAC AaG
             E A K I Y K                 97
    CELP-2   Q A K V K                  130
    CELP-4   Q A K V K                   96
    [Nucleotide lines for CELP-1, -2, -3, and -4, and one residue of the CELP-2 and CELP-4 hexapeptides, are not legible in this copy.]

Cys-Rich Domains in CELPs Are Highly Conserved. The Cys-rich regions of the CELPs are highly conserved (Fig. 2). Most strikingly, the eight Cys residues in this region are in conserved positions in these proteins. This structural conservation suggests that the Cys-rich domain of CELPs may have common functional significance.

The C Terminus of CELPs Is Highly Charged and Is Coded in a Separate Exon in CELP Genes. The CELP C terminus is defined as the region encoded by the second exon of the CELP-1 and CELP-5 genes beginning with the last conserved Val residue (Fig. 2). More than half of the C-terminal amino acid residues in this region are charged. Despite the similar charge property, each CELP is unique in the composition or the total number of charged amino acid residues in this region, thus providing variability in this region of the CELPs.

CELPs Are Different from Extensins but Their Primary Structure Is Reminiscent of Solanaceous Lectins. A striking difference between CELPs and extensins is the low Tyr content and the absence of His residues in the presumed mature CELPs (Fig. 2 and Table 1). Tyr residues in extensins are closely associated with the Ser-Pro4 motifs and can form isodityrosine linkages via intramolecular crosslinking (26, 27), rendering them highly insoluble in the cell-wall matrix. On the contrary, CELPs are relatively rich in Cys residues whereas extensins are in general low in or devoid of this amino acid residue (Table 1). A number of these Cys groups are found within the Xaa-Pro3-7 motifs and among the Xaa-Pro doublets. Some of these Cys residues may form disulfide bonds under the proper oxidative-reductive conditions and may be important participants in inter- and intramolecular interactions involving the CELPs in the extracellular matrix. On the other hand, the amino acid compositions of the CELPs, the presence of distinct Pro-rich and Cys-rich domains, and their solubility properties (A.Y.C. and H.-m.W., unpublished data) are reminiscent of the solanaceous lectins, a class of Hyp-rich proteins that are also rich in Ser and Cys (and Gly) residues (Table 1) (23, 28, 29). Solanaceous lectins are structurally very different from other known lectins and are believed to be important to cell-cell interactions.

CELP mRNAs Are Predominant in Flowers and Accumulate in Specific Cell Types, Especially in the Pistil. CELP mRNAs accumulate almost exclusively in the four floral parts (Fig. 4 A and B). Their levels are very high throughout flower development. They begin to decline by anthesis (Fig. 4C) and are greatly reduced in developing and mature fruits (data not shown). In situ hybridization of CELP probes to tissue sections indicates that CELP mRNAs accumulate in a cell-specific manner in floral tissues (Fig. 5). CELP mRNA levels are the highest in the epidermis and vascular bundles of petals (Fig. 5A) and sepals (data not shown). In the stamen, CELP mRNAs are confined to the vascular bundles of the filament (Fig. 5C), whereas there are only low levels of these mRNAs in the connective and vascular tissues of anthers (data not shown). The most dramatic cell-specific CELP mRNA accumulation pattern is observed in the pistil. In addition to being present in the vascular bundles, CELP mRNAs accumulate to high levels in a narrow region between the transmitting tissue and the cortex of the style (Fig. 5 D and F). In the ovary, CELP mRNAs concentrate in a narrow row of

FIG. 4. RNA blot analysis of CELP mRNA accumulation patterns. (A) RNA from root (lane R), stem (lane S), leaves (lane L), and flowers (lane F). Prolonged exposure of this autoradiogram revealed a very low level of CELP mRNAs in leaves. (B) Sepal (Se), petal (Pe), stamen (St), and pistil (Pi) RNA from flowers at stage 5-8 (30). (C) RNA from various floral developmental stages. The number above each lane designates floral developmental stage (30). These blots were hybridized with 32P-labeled CELP-1c DNA probe. Other CELPc probes yielded identical results.
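
The Results argue that the CELP-5 hexapeptide could have arisen from the CELP-1/-3 hexapeptide through three single-base-pair changes, each altering one amino acid (Fig. 3). The short Python sketch below checks that arithmetic. It rests on several assumptions: the CELP-5 codons are taken as printed in Fig. 3, the CELP-1 codons are read from Fig. 1, and the fifth CELP-1 codon, which is truncated in this copy, is assumed to be TAC because Fig. 3 marks no base change at that position. The codon table is a hand-picked subset of the standard genetic code, and the function names are illustrative only.

```python
# Subset of the standard genetic code, limited to the codons used below.
CODON_TABLE = {
    "GAA": "E", "TCA": "S", "GCA": "A", "AAT": "N",
    "AAG": "K", "ATT": "I", "TAC": "Y", "AGG": "R",
}


def translate(codons):
    """Translate a list of DNA codons into a one-letter peptide string."""
    return "".join(CODON_TABLE[codon] for codon in codons)


def count_differences(a, b):
    """Count positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# CELP-1 hexapeptide codons (Fig. 1); the fifth codon is ASSUMED to be TAC.
celp1_codons = ["GAA", "TCA", "AAT", "ATT", "TAC", "AGG"]
# CELP-5 hexapeptide codons (Fig. 3), written here in uppercase.
celp5_codons = ["GAA", "GCA", "AAG", "ATT", "TAC", "AAG"]

pep1 = translate(celp1_codons)
pep5 = translate(celp5_codons)

# Compare the two sequences base by base and residue by residue.
base_changes = count_differences("".join(celp1_codons), "".join(celp5_codons))
aa_changes = count_differences(pep1, pep5)

print(f"CELP-1 {pep1}  CELP-5 {pep5}")
print(f"base changes: {base_changes}, amino acid changes: {aa_changes}")
```

Under these assumptions the two hexapeptides come out as ESNIYR and EAKIYK, differing at three bases and three residues (Ser to Ala, Asn to Lys, Arg to Lys), which agrees with the three single-base-pair changes described in the text.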

cells lining the placenta that are a continuation from the stylar transmitting tissue to which ovules are attached (Fig. 5 G and J).

FIG. 5. In situ hybridization analysis of cellular accumulation patterns of CELP mRNAs. Flowers at stage 5-6 were used. Bright-field micrographs of flower sections hybridized with 35S-labeled CELP antisense RNA (A, C, D, F, G, and J) and control sense RNA (B, E, H, and I). (A and B) Cross sections of petal. (x140.) The cells without silver grain deposition in B reacted with tissue stain more efficiently than cells shown in A and so have a much darker outline. (C) Cross section of a filament. (x55.) (D and E) Longitudinal sections of a pistil. (x20.) The dark area in the vascular bundle in E was from autofluorescence of the vascular tissues. (F) Cross section of a pistil. (x55.) (G and H) Longitudinal sections of an ovary. (x15.) (I) Cross section of a pistil. (x20.) (J) Higher magnification of one region of the ovary section shown in G. (x55.) ep, epidermis; vb, vascular bundle; co, cortex; tt, transmitting tissue; pl, placenta; ov, ovule.

DISCUSSION

We have described here a family of genes and cDNAs (CELP) for a set of proteins with an extensin-like Pro-rich domain, a Cys-rich domain, and a highly charged C-terminal domain (Fig. 2). CELP mRNAs accumulate almost exclusively in flowers and they are present in diverse cell types. Protein blot analysis and subcellular localization of CELPs indicate that they are glycoproteins localized largely to the cell walls (A.Y.C. and H.-m.W., unpublished data), properties similar to many other HRGPs. However, the unique structure of these floral CELPs (Fig. 2) and the expression pattern for their genes (Figs. 4 and 5) distinguish them from several other female or male reproductive organ-specific Pro (Hyp)-rich proteins (13-17).

Structurally, being Pro-rich proteins with low Tyr contents, the CELPs must rely on chemical reactions other than the irreversible isodityrosine linkages presumably utilized by the Tyr-rich HRGPs to integrate into the cell-wall network (26, 27). It is likely that the Cys residues play a role in this aspect. In the Pro-rich domains, there are a number of Cys-Pro3-7 motifs and Cys-Pro doublets (Fig. 2). These Cys residues should be able to participate in disulfide bond formation under the proper oxidative-reductive conditions, allowing for interactions among CELPs, other Cys groups, and sulfated nonproteinaceous components in the walls. Since disulfide bonds are reversible, interactions mediated by these Cys residues are thus labile and may change as developmental, functional, and environmental demands on the flower change.

The complexity of the CELP gene family, its diverse cellular expression pattern in floral tissues, and the presence of distinct structural regions in CELPs suggest that they are multifunctional. The variabilities observed in individual structural regions among the CELPs may be able to modulate the functions mediated by them. The differences in the Xaa-Pro3-7 motifs and their interruptions by different lengths of Xaa-Pro doublets may affect the extent and the pattern of glycosylation in the CELPs, thus further amplifying their differences. These may then provide even more structural variability and functional versatility to this family of proteins as they are incorporated into the extracellular matrix of the different floral cell types.

The C-terminal half of the CELPs has eight Cys residues (≈10% of the residues) (Fig. 2 and Table 1), which are at conserved positions in these proteins. The Cys residues inevitably contribute to the structure of CELPs, which should be important to their activities. The possibility that CELPs may have a role in cell-cell interactions should be considered especially in light of their similarity to the solanaceous lectins, which are soluble HRGPs with a Pro-rich domain and a Cys-rich domain. It is known that the sugar-binding property of potato lectin resides in the Cys-rich domain and disruption of disulfide bonds interferes with this activity (28). Because of their ability to recognize other glyco-moieties, and their locations, both cellular and extracellular (29, 31), it has been suggested that solanaceous lectins have important roles in cell-cell interactions. The unique expression pattern of CELP mRNA accumulation in stylar cells demarcating the transmitting tissue suggests a possible role in restricting the path of pollen tube growth to within the transmitting tissue. Accumulation of CELP gene products on the placenta surface cells suggests that they may contribute to ovule development or to pollen tube entrance to the ovules.

Presence of several distinct structural domains suggests that CELPs may not be entirely embedded in the fibrous wall matrix. Different domains of these proteins may be available for interactions with structures contiguous to the cell walls. It is especially interesting to note the presence of a highly charged C terminus in these CELPs (Fig. 2). These charged groups may be available to interact with other charged moieties in the extracellular matrix. Moreover, the variability observed in the C termini among the CELPs should allow more versatility for these interactions. Furthermore, the highly charged C termini in CELPs are similar to a HRGP from Volvox, which has five negatively charged amino acid residues followed by five positively charged residues close to its C terminus (32). This C-terminal region was implicated in the biogenesis of the extracellular matrix during Volvox embryogenesis. Although the significance of the highly charged C termini in the CELPs remains to be determined, it is likely that this domain has a specific functional role in flower extracellular matrix.

Because of the motif similarity shared among many Pro-rich proteins, it had been suggested that recombination played a role in the evolution of these cell wall protein genes (6, 33). It has also been suggested that an ancestral cytosine-rich DNA sequence for an extensin gene existed. Gene duplication and subsequent recombinations resulted in the complex array of gene families encoding the many Pro-rich proteins. It was also suggested that the solanaceous lectins may have arisen from the recombination between a sequence encoding a Pro-rich protein and one encoding a Cys-rich protein. The structure of the CELP genes exemplifies these hypotheses. The Pro-rich regions and the Cys-rich regions in the CELPs could be a result of recombination between two distinct DNA fragments. The length variability in the Pro-rich domains (Fig. 2) reflects possible unequal cross-over in this region that resulted in large deletions in some of these genes. Smaller insertions and deletions can account for the shorter-range variations (see Fig. 2). It is probable that recombinogenic activities within the G+C-rich DNA sequences played a significant role in the evolution of the Pro-rich domains in CELPs. In addition to recombination, exon shuffling (34), a mechanism believed to be important to the evolution of proteins with multiple functional domains, might be responsible in constructing the highly charged C-terminal domains in these proteins.

The many functional roles for cell walls require the physical, biological, and chemical complexity observed for the extracellular matrix of plant cells. While a vast amount of information is available concerning the biological properties of the plant cell surface and the chemical properties of its constituents (1-3), how the cell walls are constructed and how their functions are mediated remain largely a puzzle. The many roles that some of the cell wall proteins are believed to play in plant growth, development, and its defense remain to be unequivocally demonstrated. The perplexing issues that have been addressed here concerning the CELP gene family will help focus efforts toward unraveling the structural and functional significance of these proteins and understanding how they contribute to the properties necessary for the extracellular matrix of floral tissues.

We thank Dr. Jean Haley for her comments on the manuscript and Mrs. Nancy Carrignan for her patient assistance in the preparation of this manuscript. This work was supported by a grant from the McKnight Foundation to Yale University. J.Z. was a postdoctoral fellow supported by the Rockefeller Foundation.

1. Varner, J. E. & Lin, L.-S. (1989) Cell 56, 231-239.
2. Roberts, K. (1989) Curr. Opin. Cell Biol. 1, 1020-1027.
3. Roberts, K. (1990) Curr. Opin. Cell Biol. 2, 920-928.
4. Cassab, G. I. & Varner, J. E. (1988) Annu. Rev. Plant Physiol. Plant Mol. Biol. 39, 321-353.
5. Lamb, C. J., Lawton, M. A., Dron, M. & Dixon, R. P. (1989) Cell 56, 215-224.
6. Showalter, A. M. & Varner, J. E. (1989) in The Biochemistry of Plants: A Comprehensive Treatise, ed. Marcu, A. (Academic, New York), Vol. 15, pp. 485-520.
7. Condit, C. M. & Keller, B. (1990) in Organization and Assembly of Plant and Animal Extracellular Matrix, eds. Adair, W. S. & Mecham, R. P. (Academic, New York), pp. 119-135.
8. Hong, J. C., Nagao, R. T. & Key, J. L. (1989) Plant Cell 1, 937-943.
9. Ye, Z.-H. & Varner, J. E. (1991) Plant Cell 3, 23-37.
10. Wyatt, R. E., Nagao, R. T. & Key, J. L. (1992) Plant Cell 4, 99-110.
11. Esau, K. (1977) Anatomy of Seed Plants (Wiley, New York).
12. Neale, A. D., Wahleithner, J. A., Lund, M., Bonnett, H. T., Kelly, A., Meeks-Wagner, D. R., Peacock, W. J. & Dennis, E. S. (1990) Plant Cell 2, 673-684.
13. De S. Goldman, M. H., Pezzotti, M., Seurinck, J. & Mariani, C. (1992) Plant Cell 4, 1041-1051.
14. Chen, C.-G., Cornish, E. D. & Clarke, A. E. (1992) Plant Cell 4, 1053-1062.
15. Cheung, A. Y., May, B., Gu, Q. & Wu, H.-M. (1993) Plant J. 3, 151-160.
16. Baldwin, T. C., Coen, E. S. & Dickinson, H. G. (1992) Plant J. 2, 733-739.
17. Evrard, J.-L., Jako, C., Saint-Guily, A., Weil, J.-H. & Kuntz, M. (1991) Plant Mol. Biol. 16, 271-281.
18. Ori, N., Sessa, G., Lotan, T., Himmelhoch, S. & Fluhr, R. (1990) EMBO J. 9, 3429-3436.
19. Fraser, R. S. S. (1981) Physiol. Plant Pathol. 19, 69-76.
20. Gu, Q., Kawata, E. E., Morse, M.-J., Wu, H.-M. & Cheung, A. Y. (1992) Mol. Gen. Genet. 234, 89-96.
21. Memelink, J. (1988) Dissertation (University of Leiden, The Netherlands).
22. Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Lab. Press, Plainview, NY), 2nd Ed.
23. Van Holst, G.-J., Martin, S. R., Allen, A. K., Ashford, D., Desai, N. N. & Neuberger, A. (1986) Biochem. J. 233, 731-736.
24. von Heijne, G. (1986) Nucleic Acids Res. 14, 4683-4690.
25. Alberts, B., Bray, D., Lewis, J., Raff, M., Roberts, K. & Watson, J. (1989) Molecular Biology of the Cell (Garland, New York), 2nd Ed.
26. Fry, S. C. (1986) Annu. Rev. Plant Physiol. 37, 165-186.
27. Epstein, L. & Lamport, D. T. A. (1984) Phytochemistry 23, 1242-1246.
28. Allen, A. K. (1983) in Chemical Taxonomy, Molecular Biology, and Function of Plant Lectins, eds. Goldstein, I. J. & Etzler, M. E. (Liss, New York), pp. 71-85.
29. Casalongué, C. & Pont Lezica, R. (1985) Plant Cell Physiol. 26, 1533-1539.
30. Koltunow, A. M., Truettner, J., Cox, K. H., Wallroth, M. & Goldberg, R. B. (1990) Plant Cell 2, 1201-1224.
31. Jeffree, C. E. & Yeoman, M. M. (1981) New Phytol. 87, 463-471.
32. Ertl, H., Hallmann, A., Wenzl, S. & Sumper, M. (1992) EMBO J. 11, 2055-2062.
33. Showalter, A. M. & Rumeau, D. (1990) in Organization and Assembly of Plant and Animal Extracellular Matrix, eds. Adair, W. S. & Mecham, R. P. (Academic, New York), pp. 247-281.
34. Gilbert, W. (1987) Cold Spring Harbor Symp. Quant. Biol. 52, 901-906.