Synthetic Genes for Glycoprotein Design and the Elucidation of Hydroxyproline-O-Glycosylation Codes
Total Page:16
File Type:pdf, Size:1020Kb
Synthetic genes for glycoprotein design and the elucidation of hydroxyproline-O-glycosylation codes Elena Shpak*, Joseph F. Leykam†, and Marcia J. Kieliszewski*‡ *Department of Chemistry and Biochemistry, Ohio University, Athens, OH 45701; and †Macromolecular Facility, Department of Biochemistry, Michigan State University, East Lansing, MI 48824 Communicated by Ronald R. Sederoff, North Carolina State University, Raleigh, NC, October 21, 1999 (received for review August 10, 1999) Design of hydroxyproline (Hyp)-rich glycoproteins (HRGPs) offers substituents precisely arranged along an extended Hyp-rich an approach for the structural and functional analysis of these wall polypeptide template, then an O-glycosylation code is likely. The components, which are broadly implicated in plant growth and Hyp contiguity hypothesis (8, 16, 17) correlates Hyp arabinosy- development. HRGPs consist of multiple small repetitive ‘‘glyco- lation with blocks of contiguous Hyp residues and predicts that modules’’ extensively O-glycosylated through the Hyp residues. Hyp galactosylation occurs on clustered noncontiguous Hyp The patterns of Hyp-O-glycosylation are putatively coded by the residues (8, 16). For example, contiguous Hyp4 blocks of the primary sequence as described by the Hyp contiguity hypothesis, Ser-Hyp4 glycomodules are extensively glycosylated with short which predicts contiguous Hyp residues to be attachment sites of chains comprised entirely of L-arabinose (1, 12). Similarly, small arabinooligosaccharides (1–5 Ara residues͞Hyp); while clus- dipeptidyl Hyp is the major arabinosylation site of a PRP in tered, noncontiguous Hyp residues are sites of arabinogalactan which noncontiguous Hyp was only rarely monoarabinosylated polysaccharide attachment. As a test, we designed two simple (17). On the other hand, we predict that Hyp galactosylation of HRGPs as fusion proteins with green fluorescent protein. The first clustered noncontiguous Hyp residues, such as the Xaa-Hyp- was a repetitive Ser-Hyp motif that encoded only clustered non- Xaa-Hyp repeats of AGPs, results in the addition of a galactan contiguous Hyp residues, predicted polysaccharide addition sites. core with side chains of arabinose and other sugars to form The resulting glycoprotein had arabinogalactan polysaccharide characteristic Hyp-arabinogalactan polysaccharides. Hitherto, O-linked to all Hyp residues. The second construct, based on the these sites of arabinogalactan polysaccharide attachment were consensus sequence of a gum arabic HRGP, contained both arabi- poorly defined. AGPs resist proteases, and degradation by nogalactan and arabinooligosaccharide addition sites and, as pre- partial alkaline hydrolysis yields arabinogalactan glycopeptides dicted, gave a product that contained both saccharide types. These that are difficult to purify. Therefore, as an approach to HRGP results identify an O-glycosylation code of plants. glycosylation site mapping and as a test of the code that directs Hyp-arabinogalactan polysaccharide addition, we designed a set rom green algae to flowering plants, hydroxyproline (Hyp)- of two synthetic genes that encode putative AGP glycomodules. FO-glycosylation uniquely characterizes an ancient and diverse Herein, we report that, when expressed and targeted for secre- group of structural glycoproteins associated with the cell wall (1). tion, these modules behaved as simple endogenous substrates for These Hyp-rich glycoproteins (HRGPs) are broadly implicated HRGP glycosyl transferases. The construct expressing noncon- in all aspects of growth and development, including fertilization tiguous Hyp showed exclusive polysaccharide addition, whereas (2, 3), differentiation and tissue organization (4), control of cell another construct containing noncontiguous Hyp and additional expansion growth (5), and responses to stress and pathogenesis contiguous Hyp showed both polysaccharide and arabinooligo- (6). However, our level of understanding their role is largely saccharide addition consistent with the predictions of the Hyp superficial and conjectural. The problem remains to be ex- contiguity hypothesis. plained as to why plants need so many HRGPs, what physio- logical roles they fulfill, and precisely how they fulfill them at the Materials and Methods molecular level (7). Synthetic Gene and Plasmid Construction. The signal sequence (Fig. HRGPs are generally extended, repetitive glycoproteins (8). 1) was modeled after an extensin signal sequence from Nicotiana The repeats are usually small, Ϸ5- to 16-residue motifs that are plumbaginifolia (18); mutually priming oligonucleotides were frequently highly glycosylated. Most HRGPs consist of more extended by T7 DNA polymerase, and the duplex was placed in than one type of repetitive motif. Thus, peptide sequence pUC18 as a BamHI–SstI fragment. Construction of a given periodicity and glycosylation distinguish the three major HRGP synthetic gene involved the polymerization of three sets of families: arabinogalactan proteins (AGPs), extensins, and pro- partially overlapping, complementary oligonucleotide pairs as line-rich proteins (PRPs). AGPs [Ͼ90% (wt͞wt) sugar] have described (ref. 19; Fig. 1). The following subclonings were ͞ repetitive variants of (Xaa-Hyp)n motifs (7) with O-linked required to create DNA fragments restriction sites, which al- arabinogalactan polysaccharides involving an O-galactosyl-Hyp lowed facile transfer of the signal sequence-synthetic gene- glycosidic bond (9, 10). Extensins [Ϸ50% (wt͞wt) sugar] have a enhanced green fluorescent protein (EGFP) unit to the plant diagnostic Ser-Hyp4 repeat that contains short oligosaccharides transformation vector pBI121 (ref. 20; CLONTECH); we placed of arabinose (Hyp arabinosides) involving an O-L-arabinosyl- the synthetic genes in pBluescript II SK(ϩ) (Stratagene) as Hyp linkage (1, 11). Finally, the lightly arabinosylated PRPs BamHI–EcoRI fragments and then subcloned the genes into [2–27% (wt͞wt) sugar; ref. 12] are the most highly periodic, pEGFP (CLONTECH) as BamHI–AgeI fragments preceding consisting largely of pentapeptide repeats, typically variants of the EGFP gene (21, 22). The synthetic gene–EGFP fragments Pro-Hyp-Val-Tyr-Lys (13, 14). Because repetitive HRGP glycopeptide motifs are evolution- arily conserved, we consider them small functional units and Abbreviations: Hyp, hydroxyproline; HRGP, Hyp-rich glycoprotein; AGP, arabinogalactan protein; PRP, proline-rich protein; GAGP, gum arabic glycoprotein; EGFP, enhanced green refer to them as ‘‘glycomodules.’’ HRGP glycosylation is signif- fluorescent protein. icant, because it defines the interactive molecular surface and ‡To whom reprint requests should be addressed. E-mail: [email protected]. hence should determine HRGP function as it does for other The publication costs of this article were defrayed in part by page charge payment. This extensively glycosylated glycoproteins (15). If the molecular article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. properties of these glycomodules depend on their glycosyl §1734 solely to indicate this fact. 14736–14741 ͉ PNAS ͉ December 21, 1999 ͉ vol. 96 ͉ no. 26 Downloaded by guest on September 29, 2021 Fig. 1. Oligonucleotide sets used to build the synthetic genes. Internal repeat oligonucleotide sets encoding Ser-Pro repeats or the gum arabic glycoprotein (GAGP) sequence were polymerized head-to-tail in the presence of the 5Ј linker set. After ligation, the 3Ј linker was added, and the genes were then restricted with BamHI and EcoRI and inserted into pBluescript II SK(ϩ). The signal sequence was built by primer extension of the overlapping oligonucleotides featured here. The overlap is underlined. were then subcloned into pBluescript II KS(ϩ) (Stratagene) as by Qi et al. (10). Endogenous tobacco AGPs were isolated as XmaI–NotI fragments, removed as XmaI–SstI fragments, and described (see Fig. 4). subcloned into pUC18 behind the signal sequence. DNA se- quences were confirmed by sequence analysis before insertion Coprecipitation with Yariv Reagent. We coprecipitated (Ser-  into pBI121 as BamHI–SstI fragments, replacing the -glucu- Hyp)32-EGFP, (GAGP)3-EGFP, tobacco AGPs, and native ronidase reporter gene. All constructs were under the control of GAGP with the Yariv reagent as described (24). the 35S cauliflower mosaic virus promoter. The oligonucleotides were synthesized by Life Technologies (Grand Island, NY). An Monosaccharide and Glycosyl Linkage Analysis. Monosaccharide Ala for Pro͞Hyp substitution at residue 8 of the GAGP internal compositions and linkage analyses were determined at the repeat module (Ser-Pro-Ser-Pro-Thr-Pro-Thr-Pro-Pro-Pro-Gly- Complex Carbohydrate Research Center, University of Georgia, Pro-His-Ser-Pro-Pro-Pro-Thr-Leu) was inadvertently intro- as described (25, 26). duced during synthesis by a G-for-C base substitution in the sense strand. Hyp-Glycoside Profiles. Hyp-glycoside profiles were determined as described by Lamport and Miller (11). We hydrolyzed 5.8–12.2 Tobacco Cell Transformation and Selection of Cell Lines. Suspension mg of (Ser-Hyp)32-EGFP or (GAGP)3-EGFP in 0.44 M NaOH cultured tobacco cells (Nicotiana tabacum, BY2) were trans- and neutralized the hydrolysate with 0.3 M HCl before injection formed (23) with Agrobacterium tumifaciens strain LBA4404 onto a C2 cation exchange column. containing the pBI121-derived plant transformation vector. Transformed cell lines were selected on solid Murashige–Skoog Anhydrous Hydrogen Fluoride Deglycosylation. We deglycosylated medium