<<

chapter

Carbohydrates and Glycobiology 9

Carbohydrates are the most abundant biomolecules on Earth. Each year, photosynthesis converts more than 100 billion metric tons of CO2 and H2O into and other plant products. Certain carbohydrates ( and ) are a dietary staple in most parts of the world, and the oxidation of carbohydrates is the central energy-yielding pathway in most nonphoto- synthetic cells. Insoluble polymers serve as structural and protective elements in the cell walls of bacteria and plants and in the con- nective tissues of animals. Other carbohydrate polymers lubricate skeletal joints and participate in recognition and adhesion between cells. More com- plex carbohydrate polymers covalently attached to proteins or lipids act as signals that determine the intracellular location or metabolic fate of these hybrid molecules, called glycoconjugates. This chapter introduces the major classes of carbohydrates and glycoconjugates and provides a few ex- amples of their many structural and functional roles. Carbohydrates are predominantly cyclized polyhydroxy aldehydes or ketones, or substances that yield such compounds on hydrolysis. Many, but not all, carbohydrates have the empirical formula (CH2O)n; some also con- tain nitrogen, phosphorus, or sulfur. There are three major size classes of carbohydrates: , , and (the word “saccharide” is derived from the Greek sakcharon, meaning “sugar”). Monosaccharides, or sim- ple , consist of a single polyhydroxy aldehyde or ketone unit. The most abundant in nature is the six-carbon sugar D-, sometimes referred to as dextrose. Oligosaccharides consist of short chains of monosaccharide units, or residues, joined by characteristic linkages called glycosidic bonds. The most abundant are the , with two monosaccharide units. Typical is , or cane sugar, which consists of the six-carbon sugars D-glucose Cotton bolls ready for harvest. The cotton fiber consists of and D-. All common monosaccharides and disaccharides have cellulose, a with physical and chemical names ending with the suffix “-ose.” In cells, most oligosaccharides having properties that give cotton textiles their desirable qualities. three or more units do not occur as free entities but are joined to nonsugar The cotton plant has been cultivated and used in textile production for thousands of years. molecules (lipids or proteins) in glycoconjugates. Sugar polymers occur in a continuous range of sizes. Those containing more than about 20 monosaccharide units are generally called polysac- charides. Polysaccharides may have hundreds or thousands of monosac- charide units. Some polysaccharide molecules, such as cellulose, are linear chains, whereas others, such as , are branched chains. The plant products starch and cellulose both consist of recurring units of D-glucose, but they differ in the type of glycosidic linkage, and consequently have strikingly different properties and biological roles.

293 294 Part II Structure and Catalysis

Monosaccharides and Disaccharides The simplest of the carbohydrates, the monosaccharides, are either alde- hydes or ketones with one or more hydroxyl groups; the six-carbon mono- saccharides glucose and fructose have five hydroxyl groups. The carbon atoms to which hydroxyl groups are attached are often chiral centers, which give rise to the many sugar stereoisomers found in nature. We begin by describing the families of monosaccharides with backbones of three to seven carbons—their structure and stereoisomeric forms, and the means of representing their three-dimensional structures on paper. We then discuss several chemical reactions of the carbonyl groups of monosaccharides. One such reaction, the addition of a hydroxyl group from within the same mole- cule, generates the cyclic forms of five- and six-carbon sugars (which pre- dominate in aqueous solution) and creates a new chiral center, adding further stereochemical complexity to this class of compounds. The nomen- clature for unambiguously specifying the configuration about each carbon atom in a cyclic form and the means of representing these structures on pa- per are therefore described in some detail; these will be useful later in our discussion of the metabolism of monosaccharides. We also introduce here some important monosaccharide derivatives that will be encountered in later chapters.

The Two Families of Monosaccharides Are and Monosaccharides are colorless, crystalline solids that are freely soluble in water but insoluble in nonpolar solvents. Most have a sweet taste. The back- bones of common monosaccharide molecules are unbranched carbon chains in which all the carbon atoms are linked by single bonds. In the open- chain form, one of the carbon atoms is double-bonded to an oxygen atom to form a carbonyl group; each of the other carbon atoms has a hydroxyl group. If the carbonyl group is at an end of the carbon chain (i.e., in an alde- hyde group) the monosaccharide is an ; if the carbonyl group is at any other position (in a ketone group) the monosaccharide is a . The simplest monosaccharides are the two three-carbon : glyceralde- hyde, an aldotriose, and , a ketotriose (Fig. 9–1a). Monosaccharides with four, five, six, and seven carbon atoms in their backbones are called, respectively, , , , and hep- toses. There are aldoses and ketoses of each of these chain lengths: al- dotetroses and ketotetroses, aldopentoses and ketopentoses, and so on. figure 9–1 The hexoses, which include the aldohexose D-glucose and the ketohexose Representative monosaccharides. (a) Two trioses, an D-fructose (Fig. 9–1b), are the most common monosaccharides in nature. aldose and a ketose. The carbonyl group in each is The aldopentoses D- and 2-deoxy-D-ribose (Fig. 9–1c) are compo- shaded. (b) Two common hexoses. (c) The nents of nucleotides and nucleic acids (Chapter 10). components of nucleic acids. D-Ribose is a component of ribonucleic acid (RNA), and 2-deoxy-D-ribose is a component of deoxyribonucleic acid (DNA).

H O H C H C OH H O H O H C OH C O C C H O H C H C OH HO C H HO C H H C OH CH2 H C OH C O H C OH H C OH H C OH H C OH A A HCOH HCOH H C OH H C OH H C OH H C OH

H H CH2OH CH2OH CH2OH CH2OH , Dihydroxyacetone, D-Glucose, D-Fructose, D-Ribose, 2-Deoxy-D-ribose, an aldotriose a ketotriose an aldohexose a ketohexose an aldopentose an aldopentose (a) (b) (c) Chapter 9 Carbohydrates and Glycobiology 295

Monosaccharides Have Asymmetric Centers All the monosaccharides except dihydroxyacetone contain one or more asymmetric (chiral) carbon atoms and thus occur in optically active iso- meric forms (pp. 58–61). The simplest aldose, glyceraldehyde, contains one chiral center (the middle carbon atom) and therefore has two different op- tical isomers, or enantiomers (Fig. 9–2). By convention, one of these two forms is designated the D isomer, the other the L isomer. As for other biomolecules with chiral centers, the absolute configurations of sugars are known from x-ray crystallography. To represent three-dimensional sugar structures on paper, we often use Fischer projection formulas (Fig. 9–2).

Mirror CHO CHO HCOH HO C H

CH2OH CH 2OH CHO CHO D-Glyceraldehyde L-Glyceraldehyde

Fischer projection formulas H OH H OH

CHO CHO H C OH HO C H CH2OH CH OH 2 CH 2OH CH2OH D-Glyceraldehyde L-Glyceraldehyde

Ball-and-stick models Perspective formulas

figure 9–2 Three ways to represent the two stereoisomers of glycer- aldehyde. The stereoisomers are mirror images of each other. Ball-and-stick models show the actual configuration n of molecules. By convention, in Fischer projection for- In general, a molecule with n chiral centers can have 2 stereoisomers. mulas, horizontal bonds project out of the plane of the 1 Glyceraldehyde has 2 2; the aldohexoses, with four chiral centers, have paper, toward the reader; vertical bonds project behind 24 16 stereoisomers. The stereoisomers of monosaccharides of each car- the plane of the paper, away from the reader. Recall (see bon chain length can be divided into two groups that differ in the configu- Fig. 3–7) that in perspective formulas, solid wedge- ration about the chiral center most distant from the carbonyl carbon. shaped bonds point toward the reader, dashed wedges point away. Those in which the configuration at this reference carbon is the same as that of D-glyceraldehyde are designated D isomers, and those with the same configuration as L-glyceraldehyde are L isomers. When the hydroxyl group on the reference carbon is on the right in the projection formula, the sugar is the D isomer; when on the left, it is the L isomer. Of the 16 possible aldo- hexoses, eight are D forms and eight are L. Most of the hexoses of living or- ganisms are D isomers. Figure 9–3 shows the structures of the D stereoisomers of all the al- doses and ketoses having three to six carbon atoms. The carbons of a sugar are numbered beginning at the end of the chain nearest the carbonyl group. Each of the eight D-aldohexoses, which differ in the stereochemistry at C-2, C-3, or C-4, has its own name: D-glucose, D-, D-, and so forth (Fig. 9–3a). The four- and five-carbon ketoses are designated by Three carbons Four carbons Five carbons

H O H O H O H O H O H O C C C C H O C C H C OH HO C H H C OH HO C H C H C OH HO C H HOHC HOHC HO C H HO C H H C OH H C OH H C OH H C OH H C OH H C OH H C OH

CH2OH CH2OH CH2OH CH2OH CH2OH CH2OH CH2OH

D-Glyceraldehyde D- D- D-Ribose D- D- D-

Six carbons

H O H O H O H O H O H O H O H O C C C C C C C C H C OH HO C H H C OH HO C H H C OH HO C H H C OH HO C H H C OH H C OH HO C H HO C H H C OH H C OH HO C H HO C H HOHC HOHC HOHC H C OH HO C H HO C H HOC H HO C H H C OH H C OH H C OH H C OH H C OH H C OH H C OH H C OH

CH2OH CH2OH CH2OH CH2OH CH2OH CH2OH CH2OH CH2OH

D- D- D-Glucose D-Mannose D- D- D-Galactose D-

D-Aldoses (a) figure 9–3 The series of (a) D-aldoses and (b) D-ketoses having from Three carbons Four carbons Five carbons three to six carbon atoms, shown as projection formulas.

The carbon atoms in red are chiral centers. In all these CH2OH CH2OH D isomers, the chiral carbon most distant from the carbonyl carbon has the same configuration as the chiral carbon in C O C O CH2OH D-glyceraldehyde. The sugars named in boxes are the CH2OH H C OH HO C H most common in nature; we shall encounter these again C O in this and later chapters. C O H C OH H C OH H C OH CH2OH CH2OH CH2OH CH2OH D D Dihydroxyacetone D- - -

Six carbons

CH2OH CH2OH CH2OH CH2OH C O C O C O C O H C OH HO C H HHOC OH C H H C OH H C OH HO C H HO C H H C OH H C OH H C OH H C OH

CH2OH CH2OH CH2OH CH2OH

D- D-Fructose D- D-

D-Ketoses (b)

296 Chapter 9 Carbohydrates and Glycobiology 297

1 1 1 CHO CHO CHO figure 9–4 2 2 2 Epimers. D-Glucose and two of its epimers are shown as HO C H H C OH H C OH projection formulas. Each epimer differs from D-glucose in 3 3 3 HO C H HO C H HO C H the configuration at one chiral center (shaded red). 4 4 4 H C OH H C OH HO C H 5 5 5 HCOH HCOH HCOH 6 6 6 CH2OH CH2OH CH2OH D-Mannose D-Glucose D-Galactose (epimer at C-2) (epimer at C-4)

inserting “ul” into the name of a corresponding aldose; for example, D-ribu- lose is the ketopentose corresponding to the aldopentose D-ribose. The ke- tohexoses are named otherwise, such as fructose (from Latin fructus, meaning “fruit”; fruits are rich in this sugar) and sorbose (from Sorbus, the H O genus of mountain ash, which has berries rich in the related sugar alcohol GJ C sorbitol). Two sugars that differ only in the configuration around one car- A bon atom are called epimers; D-glucose and D-mannose, which differ only HOCOOH A in the stereochemistry at C-2, are epimers, as are D-glucose and D-galactose HOOCOH (which differ at C-4) (Fig. 9–4). A HOOCOH Some sugars occur naturally in their L form; examples are L-arabinose A and the L isomers of some sugar derivatives (discussed below) that are com- CH2OH mon components of glycoconjugates. L-Arabinose

The Common Monosaccharides Have Cyclic Structures For simplicity, we have thus far represented the structures of various al- doses and ketoses as straight-chain forms (Figs. 9–3, 9–4). In fact, in aque- ous solution, aldotetroses and all monosaccharides with five or more carbon atoms in the backbone occur predominantly as cyclic (ring) structures in which the carbonyl group has formed a covalent bond with the oxygen of a hydroxyl group along the chain. The formation of these ring structures is the result of a general reaction between aldehydes or ketones and alcohols to form derivatives called hemiacetals or hemiketals (Fig. 9–5), which contain an additional asymmetric carbon atom and thus can exist in two stereoisomeric forms. For example, D-glucose exists in solution as an intra- molecular hemiacetal in which the free hydroxyl group at C-5 has reacted

3 3 O OH HO R OR 1 2 1 2 1 2 R C HO R R C OR R C OR H2O H 3 H HO R H Aldehyde Alcohol Hemiacetal Acetal figure 9–5 Formation of hemiacetals and hemiketals. An aldehyde 4 4 or ketone can react with an alcohol in a 11 ratio to yield OH HO R OR a hemiacetal or hemiketal, respectively, creating a new 1 3 1 3 1 3 chiral center at the carbonyl carbon. Substitution of a R C O HO R R C OR R C OR H2O second alcohol molecule produces an acetal or ketal. 2 2 4 2 R R HO R R When the second alcohol is part of another sugar mole- Ketone Alcohol Hemiketal Ketal cule, the bond produced is a (p. 301). 298 Part II Structure and Catalysis figure 9–6 H O Formation of the two cyclic forms of D-glucose. Reac- tion between the aldehyde group at C-1 and the hydroxyl 1 C group at C-5 forms a hemiacetal linkage, producing either 2 H C OH of two stereoisomers, the a and b , which differ 3 only in the stereochemistry around the hemiacetal HO C H D-Glucose carbon. The interconversion of a and b anomers is called 4 . H C OH 5 H C OH 6 CH2OH

6 CH2OH 5 C OH H H H 4 C C 1 OH H HO O C C 3 2 H OH

6 6 CH2OH CH2OH 5 C O 5 C O H H H OH H H 4 C 1C 4 C 1C OH H mutarotation OH H HO OH HO H C C C C 3 2 3 2 H OH H OH -D-Glucopyranose -D-Glucopyranose

with the aldehydic C-1, rendering the latter asymmetric and producing two stereoisomers, designated a and b (Fig. 9–6). These six-membered ring compounds are called because they resemble the six-mem- bered ring compound pyran (Fig. 9–7). The systematic names for the two ring forms of D-glucose are a-D-glucopyranose and b-D-glucopyranose. Aldohexoses also exist in cyclic forms having five-membered rings, which, because they resemble the five-membered ring compound furan, are called . However, the six-membered aldopyranose ring is much more stable than the aldofuranose ring and predominates in aldohexose so- lutions. Only aldoses having five or more carbon atoms can form rings.

6 CH2OH CH2OH 5 O O HC O H H H H H OH 4 1 HC CH OH H OH H HO OH HO H 3 2 H2C CH H OH H OH figure 9–7 -D-Glucopyranose -D-Glucopyranose Pyran Pyranoses and furanoses. The pyranose forms of 6 D-glucose and the forms of D-fructose are HOCH O 1CH OH HOCH O OH O shown here as Haworth perspective formulas. The edges 2 2 2 of the ring nearest the reader are represented by bold 5 H HO 2 H HO HC CH lines. Hydroxyl groups below the plane of the ring in H OH H CH2OH 4 3 C C these Haworth perspectives would appear at the right H H side of a Fischer projection (compare with Fig. 9–6). OH H OH H Pyran and furan are shown for comparison. -D-Fructofuranose -D-Fructofuranose Furan Chapter 9 Carbohydrates and Glycobiology 299

Axis Axis Axis ax ax ax H eq O eq H2COH O eq ax ax O eq HO H ax eq H eq eq eq HO H eq eq OH ax ax ax ax H OH

Two possible chair forms -D-Glucopyranose (a) (b) figure 9–8 Conformational formulas of pyranoses. (a) Two chair Isomeric forms of monosaccharides that differ only in their configura- forms of the pyranose ring. Substituents on the ring tion about the hemiacetal or hemiketal carbon atom are called anomers. carbons may be either axial (ax), projecting parallel with The hemiacetal or carbonyl carbon atom is called the anomeric carbon. the vertical axis through the ring, or equatorial (eq), pro- jecting roughly perpendicular to this axis. Generally, sub- The a and b anomers of D-glucose interconvert in aqueous solution by a stituents in the equatorial positions are less sterically hin- process called mutarotation. Thus a solution of a-D-glucose and a solution dered by neighboring substituents, and conformations of b-D-glucose eventually form identical equilibrium mixtures having identi- with their bulky substituents in equatorial positions are cal optical properties. This mixture consists of about one-third a-D-glucose, favored. Another conformation, the “boat” (not shown), is only seen in derivatives with very bulky substituents. two-thirds b-D-glucose, and very small amounts of the linear and five- (b) A chair conformation of a-D-glucopyranose. membered ring (glucofuranose) forms. Ketohexoses also occur in a and b anomeric forms. In these compounds the hydroxyl group on C-5 (or C-6) reacts with the keto group at C-2, form- ing a furanose (or pyranose) ring containing a hemiketal linkage (Fig. 9–5). D-Fructose forms predominantly the furanose ring (Fig. 9–7); the more common in combined forms or derivatives is b-D-fructofuranose. Haworth perspective formulas like those in Figure 9–7 are com- monly used to show the ring forms of monosaccharides. However, the six- membered pyranose ring is not planar, as Haworth perspectives suggest, but tends to assume either of two “chair” conformations (Fig. 9–8). Recall from Chapter 3 that two conformations of a molecule are interconvertible without the breakage of covalent bonds, but two configurations can be in- terconverted only by breaking a covalent bond—for example, in the case of a and b configurations, the bond involving the ring oxygen atom. The spe- cific three-dimensional conformations of the monosaccharide units are im- portant in determining the biological properties and functions of some poly- saccharides, as we shall see.

Organisms Contain a Variety of Derivatives In addition to simple hexoses such as glucose, galactose, and mannose, there are a number of sugar derivatives in which a hydroxyl group in the parent compound is replaced with another substituent, or a carbon atom is oxidized to a carboxyl group (Fig. 9–9). In glucosamine, galactosamine, and mannosamine, the hydroxyl at C-2 of the parent compound is replaced with an amino group. The amino group is nearly always condensed with acetic acid, as in N-acetylglucosamine. This glucosamine derivative is part of many structural polymers, including those of the bacterial cell wall. Bacterial cell walls also contain a derivative of glucosamine called N-acetylmuramic acid, in which lactic acid (a three-carbon carboxylic acid) is ether-linked to the oxygen at C-3 of N-acetylglucosamine. The substitution of a hydrogen for the hydroxyl group at C-6 of L-galactose or L-mannose produces L- or L-, respectively; these deoxy sugars are found in plant polysac- charides and in the complex components of and glycolipids described later. When the carbonyl (aldehyde) carbon of glucose is oxidized to the car- boxyl level, gluconic acid is produced; other aldoses yield other aldonic acids. Oxidation of the carbon at the other end of the carbon chain—C-6 300 Part II Structure and Catalysis

Glucose family

Amino sugars

CH2OH CH2OH CH2OH CH2OH CH2OH O O O O O H H OH H H OH H H OH HO H OH H H OH H2N OH H OH H OH H OH H OH HO H HO H HO H H H HO H

H OH H NH2 H NH H NH2 H H C O -D-Galactosamine -D-Mannosamine

CH3 -D-Glucose -D-Glucosamine N-Acetyl--D-glucosamine Deoxy sugars

2 CH2 O PO3 CH2OH CH2OH H H O O O CH O O H OH H OH H OH 3 H H HO OH H H H CH3 CH3 R O C H OH H R H R H H OH H H OH H H HO H HO H HO H COO HO

H OH H NH2 H NH OH H OH OH C O

CH3 -D-Glucose 6-phosphate Muramic acid N-Acetylmuramic acid -L-Fucose -L-Rhamnose

Acidic sugars

O O CH3 O O CH OH C 2 CH2OH O C H C R O OH O O H COH H O H H OH H H H HN R C O H COH OH H OH H OH H H H HO H HO O HO H OH CH2OH

H OH H OH H OH OH H -D-Glucuronate D-Gluconate D-Glucono--lactone N-Acetylneuraminic acid (sialic acid)

figure 9–9 Some hexose derivatives important in biology. In amino sugars, an XNH2 group replaces one of the XOH groups of glucose, galactose, or mannose—forms the corresponding uronic acid: in the parent hexose. Substitution of XH for XOH pro- glucuronic, galacturonic, or mannuronic acid. Both aldonic and uronic acid duces a deoxy sugar. Note that the deoxy sugars shown here occur in nature as the L isomers. The acidic sugars form stable intramolecular esters called lactones (Fig. 9–9, lower left). In contain a carboxylate group, which confers a negative addition to these acidic hexose derivatives, one nine-carbon acidic sugar charge at neutral pH. D-Glucono-d-lactone results from deserves mention: N-acetylneuraminic acid (sialic acid), a derivative of N- formation of an ester linkage between the C-1 carboxylate acetylmannosamine, is a component of many glycoproteins and glycolipids group and the C-5 (also known as the d carbon) hydroxyl in animals. The carboxylic acid groups of the acidic sugar derivatives are group of D-gluconate. ionized at pH 7, and the compounds are therefore correctly named as car- boxylates—glucuronate, galacturonate, and so forth. In the synthesis and metabolism of carbohydrates, the intermediates are very often not the sugars themselves but their phosphorylated deriva- tives. Condensation of phosphoric acid with one of the hydroxyl groups of a sugar forms a phosphate ester, as in glucose 6-phosphate (Fig. 9–9). Sugar phosphates are relatively stable at neutral pH and bear a negative Chapter 9 Carbohydrates and Glycobiology 301 charge. One effect of sugar phosphorylation within cells is to trap the sugar inside the cell; cells do not in general have plasma membrane transporters for phosphorylated sugars. Phosphorylation also activates sugars for subse- quent chemical transformation. Several important phosphorylated deriva- tives of sugars are discussed in the next chapter.

Monosaccharides Are Reducing Agents Monosaccharides can be oxidized by relatively mild oxidizing agents such as ferric (Fe3) or cupric (Cu2) ion (Fig. 9–10a). The carbonyl carbon is ox- idized to a carboxyl group. Glucose and other sugars capable of reducing ferric or cupric ion are called reducing sugars. This property is the basis of Fehling’s reaction, a qualitative test for the presence of . By measuring the amount of oxidizing agent reduced by a solution of a sugar, it is also possible to estimate the concentration of that sugar. For many years, this test was used to detect and measure elevated glucose lev- els in blood and urine in the diagnosis of diabetes mellitus. Now, more sen- sitive methods for measuring blood glucose employ an enzyme, glucose ox- idase (Fig. 9–10b).

6 CH2OH figure 9–10 5 O Sugars as reducing agents. (a) Oxidation of the anomeric H OH carbon of glucose and other sugars is the basis for H 41 Fehling’s reaction. The cuprous ion (Cu ) produced OH H HO H under alkaline conditions forms a red cuprous oxide pre- 3 2 cipitate. In the hemiacetal (ring) form, C-1 of glucose H OH cannot be oxidized by Cu2 . However, the open-chain -D-Glucose form is in equilibrium with the ring form, and eventually the oxidation reaction goes to completion. The reaction with Cu2 is not as simple as the equation here implies; in addition to D-gluconate, a number of shorter-chain acids are produced by the fragmentation of glucose. H O O O (b) Blood glucose concentration is commonly determined GJ GJ 1C C by measuring the amount of H2O2 produced in the reac- 2 A A tion catalyzed by glucose oxidase. In the reaction mixture, HOOC OH HOOC OH a second enzyme, peroxidase, catalyzes reaction of the A 2 A 3 2Cu 2Cu H O with a colorless compound to produce a colored HOOCOH HOOCOH 2 2 compound, the amount of which is then measured spec- 4 A A HOCOOH HOCOOH trophotometrically. 5 A A HOOC OH HOOC OH A A 6 CH2OH CH2OH D-Glucose D-Gluconate (linear form)

(a)

glucose oxidase D-Glucose O2 D-Glucono-d-lactone H2O2

(b)

Disaccharides Contain a Glycosidic Bond Disaccharides (such as , , and sucrose) consist of two mono- saccharides joined covalently by an O-glycosidic bond, which is formed when a hydroxyl group of one sugar reacts with the anomeric carbon of the other (Fig. 9–11). This reaction represents the formation of an acetal from a hemiacetal (such as glucopyranose) and an alcohol (a hydroxyl group of a second sugar molecule) (Fig. 9–5). When an anomeric carbon participates 302 Part II Structure and Catalysis

CH2OH CH2OH in a glycosidic bond, it cannot be oxidized by cupric or ferric ion. The sugar O hemiacetal O containing the anomeric carbon atom cannot exist in linear form and no HHH H H OH longer acts as a reducing sugar. In describing disaccharides or polysaccha- OH H OH H rides, the end of a chain with a free anomeric carbon (i.e., not involved in a HO OH HO H glycosidic bond) is commonly called the reducing end. Glycosidic bonds H OH alcohol H OH are readily hydrolyzed by acid, but resist cleavage by base. Thus disaccha- -D-Glucose -D-Glucose rides can be hydrolyzed to yield their free monosaccharide components by boiling with dilute acid. Another type of glycosidic bond joins the anomeric hydrolysis condensation carbon of a sugar to a nitrogen atom in glycoproteins (see Fig. 9–25). These H O H O 2 2 N-glycosyl bonds are also found in all nucleotides (Chapter 10). 6 6 CH2OH CH2OH The maltose (Fig. 9–11) contains two D-glucose residues 5 5 O acetal O hemiacetal joined by a glycosidic linkage between C-1 (the anomeric carbon) of one H H H H H OH glucose residue and C-4 of the other. Because the free anomeric carbon 4141 OH H OH H (C-1 of the glucose residue on the right in Fig. 9–11) can be oxidized, mal- H HO O tose is a reducing disaccharide. The configuration of the anomeric carbon 3 2 3 2 H OH H OH atom in the glycosidic linkage is a. The glucose residue with the free Maltose anomeric carbon is capable of existing in a- and b-pyranose forms. -D-glucopyranosyl-(1n4)-D-glucopyranose To name reducing disaccharides such as maltose unambiguously, and es- pecially to name more complex oligosaccharides, several rules are followed. figure 9–11 By convention, the name describes the compound with its nonreducing end Formation of maltose. A disaccharide is formed from two monosaccharides (here, two molecules of D-glucose) to the left, and the name is “built up” in the following order. (1) The config- when an XOH (alcohol) of one glucose molecule (right) uration (a or b) at the anomeric carbon joining the first monosaccharide unit condenses with the intramolecular hemiacetal of the other (on the left) to the second is given. (2) The nonreducing residue is named. glucose molecule (left), with the elimination of H2O and To distinguish five- and six-membered ring structures, “furano” or “pyrano” formation of an O-glycosidic bond. The reversal of this is inserted into the name. (3) The two carbon atoms joined by the glycosidic reaction is hydrolysis—attack by H2O on the glycosidic bond. The maltose molecule retains a reducing hemi- bond are indicated in parentheses, with an arrow connecting the two num- acetal at the C-1 not involved in the glycosidic bond. bers; for example, (1n4) shows that C-1 of the first-named sugar residue is Because mutarotation interconverts the a and b forms of joined to C-4 of the second. (4) The second residue is named. If there is a the hemiacetal, the bonds at this position are sometimes third residue, the second glycosidic bond is described next, by the same con- depicted with wavy lines to indicate that the structure ventions. (To shorten the description of complex polysaccharides, three-let- may be either a or b. ter abbreviations for each monosaccharide are often used, as given in Table 9–1.) Following this convention for naming oligosaccharides, maltose is a-D- glucopyranosyl-(1n4)-D-glucopyranose. Because most sugars encountered in this book are the D enantiomers and the pyranose form of hexoses pre- dominates, we will generally use a shortened version of the formal name of such compounds, giving the configuration of the anomeric carbon and nam- ing the carbons joined by the glycosidic bond. In this abbreviated nomencla- ture, maltose is Glc(a1n4)Glc.

table 9–1 Abbreviations for Common Monosaccharides and Some of Their Derivatives Abequose Abe Glucuronic acid GlcA Arabinose Ara Galactosamine GalN Fructose Fru Glucosamine GlcN Fucose Fuc N-Acetylgalactosamine GalNAc Galactose Gal N-Acetylglucosamine GlcNAc Glucose Glc Muramic acid Mur Mannose Man N-Acetylmuramic acid Mur2Ac Rhamnose Rha N-Acetylneuraminic acid Neu5Ac Ribose Rib (sialic acid) Xylose Xyl Chapter 9 Carbohydrates and Glycobiology 303

D D 6 6 The disaccharide lactose (Fig. 9–12), which yields -galactose and - CH2OH CH2OH glucose on hydrolysis, occurs naturally only in milk. The anomeric carbon 5 O 5 O of the glucose residue is available for oxidation, and thus lactose is a re- HO H H H OH 41 O 41 ducing disaccharide. Its abbreviated name is Gal(b1n4)Glc. Sucrose (table OH H OH H sugar) is a disaccharide of glucose and fructose. It is formed by plants but H H H 3 2 3 2 not by higher animals. In contrast to maltose and lactose, sucrose contains H OH H OH no free anomeric carbon atom; the anomeric carbons of both monosaccha- Lactose ( form) ride units are involved in the glycosidic bond (Fig. 9–12). Sucrose is there- -D-galactopyranosyl-(1n4)--D-glucopyranose fore not a reducing sugar. Nonreducing disaccharides are named as glyco- Gal(1n4)Glc sides; the positions joined are the anomeric carbons. In the abbreviated 6CH OH nomenclature, a double-headed arrow connects the symbols specifying the 2 1 O HOCH anomeric carbons and their configurations. For example, the abbreviated 5 2 O H H H H name of sucrose is either Glc(a1mn2b)Fru or Fru(b2mn1a)Glc. Sucrose is a 4 1 2 5 OH H H HO major intermediate product of photosynthesis; in many plants it is the prin- HO CH2OH O 6 cipal form in which sugar is transported from the leaves to other parts of 3 2 3 4 OH H the plant body. , Glc(a1mn1a)Glc (Fig. 9–12), is a disaccharide of H OH Sucrose D-glucose that, like sucrose, is a nonreducing sugar. It is a major constituent -D-fructofuranosyl -D-glucopyranoside of the circulating fluid (hemolymph) of insects, in which it serves as an en- Fru(2 nn 1)Glc ergy storage compound. 6 CH2OH H O O 5 H 5 H OH Polysaccharides H HOCH2 41 O 1 6 4 Most carbohydrates found in nature occur as polysaccharides, polymers of OH H OH H HO H H medium to high molecular weight. Polysaccharides, also called , dif- 3 2 2 3 fer from each other in the identity of their recurring monosaccharide units, H OH H OH in the length of their chains, in the types of bonds linking the units, and in Trehalose the degree of branching. Homopolysaccharides contain only a single type -D-glucopyranosyl -D-glucopyranoside Glc(1nn 1 )Glc of monomer; heteropolysaccharides contain two or more different kinds (Fig. 9–13). Some homopolysaccharides serve as storage forms of mono- figure 9–12 saccharides used as fuels; starch and glycogen are homopolysaccharides of Some common disaccharides. Like maltose in Figure this type. Other homopolysaccharides (cellulose and , for example) 9–11, these are shown as Haworth perspectives. The serve as structural elements in plant cell walls and animal exoskeletons. common name, full systematic name, and abbreviation are given for each disaccharide. Heteropolysaccharides provide extracellular support for organisms of all kingdoms. For example, the rigid layer of the bacterial cell envelope (the peptidoglycan) is composed in part of a heteropolysaccharide built from two alternating monosaccharide units. In animal tissues, the extracellular

Heteropolysaccharides

Homopolysaccharides Two monomer Multiple types, monomer types, Unbranched Branched unbranched branched

figure 9–13 Polysaccharides may be composed of one, two, or several different monosaccharides, in straight or branched chains of varying length. 304 Part II Structure and Catalysis

space is occupied by several types of heteropolysaccharides, which form a matrix that holds individual cells together and provides protection, shape, and support to cells, tissues, and organs. Unlike proteins, polysaccharides generally do not have definite molec- ular weights. This difference is a consequence of the mechanisms of as- sembly of the two types of polymers. As we shall see in Chapter 27, proteins are synthesized on a template (messenger RNA) of defined sequence and length, by enzymes that follow the template exactly. For polysaccharide synthesis, there is no template; rather, the program for polysaccharide syn- thesis is intrinsic to the enzymes that catalyze the polymerization of the monomeric units.

Starch and Glycogen Are Stored Fuels The most important storage polysaccharides are starch in plant cells and glycogen in animal cells. Both polysaccharides occur intracellularly as large clusters or granules (Fig. 9–14). Starch and glycogen molecules are heavily hydrated because they have many exposed hydroxyl groups available to hy- drogen bond with water. Most plant cells have the ability to form starch, but it is especially abundant in tubers, such as potatoes, and in seeds, such as those of maize (corn). Starch contains two types of glucose polymer, and amy- lopectin (Fig. 9–15). The former consists of long, unbranched chains of D- glucose residues connected by (a1n4) linkages. Such chains vary in mole- cular weight from a few thousand to over a million. also has a high molecular weight (up to 100 million) but unlike amylose is highly branched. The glycosidic linkages joining successive glucose residues in amylopectin chains are (a1n4); the branch points (about one per 24 to 30 residues) are (a1n6) linkages. Glycogen is the main storage polysaccharide of animal cells. Like amy- lopectin, glycogen is a polymer of (a1n4)-linked subunits of glucose, with (a1n6)-linked branches, but glycogen is more extensively branched (on average, one branch per 8 to 12 residues) and more compact than starch. Glycogen is especially abundant in the liver, where it may constitute as figure 9–14 much as 7% of the wet weight; it is also present in skeletal muscle. In he- Electron micrographs of starch and glycogen granules. patocytes glycogen is found in large granules (Fig. 9–14b), which are them- (a) Large starch granules in a single chloroplast. Starch is selves clusters of smaller granules composed of single, highly branched made in the chloroplast from D-glucose formed photosyn- glycogen molecules with an average molecular weight of several million. thetically. (b) Glycogen granules in a hepatocyte. These Such glycogen granules also contain, in tightly bound form, the enzymes re- granules form in the cytosol and are much smaller sponsible for the synthesis and degradation of glycogen. (0.1 mm) than starch granules (1.0 mm).

Starch granules

Glycogen granules (a) (b) 6 CH2OH CH2OH CH2OH CH2OH 5 O O O O H H H H H H H H Nonreducing H H H H Reducing 4 1 41 4 1 4 1 end OH H OH H OH H OH H end O O O O O 3 2 H OH H OH H OH H OH

(a)

6 CH2OH O H H H 4 1 OH H O ( 1n6) branch Branch H OH point O A 6 CH2 O H H H 4 1 OH H O O Main H OH chain

(b)

figure 9–15 Amylose Amylose and amylopectin, the polysaccharides of starch. (a) A short segment of amylose, a linear polymer of D-glucose residues in (a1n4) linkage. A single chain Reducing can contain several thousand glucose residues. Amy- ends lopectin has stretches of similarly linked residues between branch points. (b) An (a1n6) branch point of amy- Nonreducing lopectin. (c) A cluster of amylose and amylopectin like ends that believed to occur in starch granules. Strands of amy- Amylopectin lopectin (red) form double-helical structures with each other or with amylose strands (blue). Glucose residues at the nonreducing ends of the outer branches are removed enzymatically, one at a time, during the intracellular mobi- lization of starch for energy production. Glycogen has a similar structure but is more highly branched and more (c) compact.

Because each branch in glycogen ends with a nonreducing sugar (one without a free anomeric carbon), a glycogen molecule has as many nonre- ducing ends as it has branches, but only one reducing end. When glycogen is used as an energy source, glucose units are removed one at a time from the nonreducing ends. Degradative enzymes that act only at nonreducing ends can work simultaneously on the many branches, speeding the conver- sion of the polymer to monosaccharides. Why not store glucose in its monomeric form? It has been calculated that hepatocytes store glycogen equivalent to a glucose concentration of 0.4 M. The actual concentration of glycogen, which is insoluble and con- tributes little to the osmolarity of the cytosol, is about 0.01 mM. If the cytosol contained 0.4 M glucose, the osmolarity would be threateningly elevated, leading to osmotic entry of water that might rupture the cell (see Fig. 4– 11). Furthermore, with an intracellular glucose concentration of 0.4 M and an external concentration of about 5 mM (the concentration in the blood of a mammal), the free-energy change for glucose uptake into cells against this very high concentration gradient would be prohibitively large.

305 OH HO O O CH2OH

O

CH2OH HO

O

HO O

(1n4)-linked D-glucose units

(a) figure 9–16 The structure of starch (amylose). (a) In the most stable conformation of adjacent rigid chairs, the polysaccharide chain is curved, rather than linear as in cellulose (see Fig. 9–17). (b) Scale drawing of a segment of amylose. (b) The conformation of (a1n4) linkages in amylose, amy- lopectin, and glycogen causes these polymers to assume tightly coiled helical structures. These compact structures produce the dense granules of stored starch or glycogen seen in many cells (see Fig. 9–14). The three-dimensional structure of starch is shown in Figure 9–16, and it is compared with the structure of cellulose below.

Cellulose and Chitin Are Structural Homopolysaccharides Cellulose, a fibrous, tough, water-insoluble substance, is found in the cell walls of plants, particularly in stalks, stems, trunks, and all the woody por- tions of the plant body. Cellulose constitutes much of the mass of wood, and cotton is almost pure cellulose. Like amylose and the main chains of amy- lopectin and glycogen, the cellulose molecule is a linear, unbranched ho- mopolysaccharide, consisting of 10,000 to 15,000 D-glucose units. But there is a very important difference: in cellulose the glucose residues have the b configuration (Fig. 9–17), whereas in amylose, amylopectin, and glycogen the glucose is in the a configuration. The glucose residues in cellulose are linked by (b1n4) glycosidic bonds. This difference gives cellulose and amy- lose very different three-dimensional structures and physical properties. The three-dimensional structure of carbohydrate macromolecules is based on the same principles as those underlying polypeptide structure: subunits with a more-or-less rigid structure dictated by covalent bonds form three-dimensional macromolecular structures that are stabilized by weak interactions. Because polysaccharides have so many hydroxyl groups, hy- drogen bonding has an especially important influence on their structures. Polymers of b-D-glucose, such as cellulose, can be represented as a series of rigid pyranose rings connected by an oxygen atom bridging two carbon atoms (the glycosidic bond). There is free rotation about both CXO bonds linking the two residues (Fig. 9–17a). The most stable conformation for the polymer is that in which each chair is turned 180 relative to its neighbors, yielding a straight, extended chain. With several chains lying side by side, a stabilizing network of interchain and intrachain hydrogen bonds produces straight, stable supramolecular fibers of great tensile strength (Fig. 9–17b). The tensile strength of cellulose has made it a useful substance to civiliza- tions for millenia. Many manufactured products, including paper, card- board, rayon, insulating tiles, and other packing and building materials, are derived from cellulose. The water content of these materials is low because extensive interchain hydrogen bonding between cellulose molecules satis- fies their capacity for hydrogen-bond formation.

306 Chapter 9 Carbohydrates and Glycobiology 307

OH OH HO 6 O 4 O 5 2 O O O HO OH 1 OH 3 (1n4)-linked D-glucose units

(a)

(b)

In contrast to the straight fibers produced by (b1n4)-linked polymers figure 9–17 such as cellulose, the most favorable conformation for (a1n4)-linked poly- The structure of cellulose. (a) Two units of a cellulose mers of D-glucose, such as starch and glycogen, is a tightly coiled helical chain; the D-glucose residues are in (b1n4) linkage. The structure stabilized by hydrogen bonds (Fig. 9–16). rigid chair structures can rotate relative to one another. (b) Scale drawing of segments of two parallel cellulose Glycogen and starch ingested in the diet are hydrolyzed by a-amylases, chains, showing the conformation of the D-glucose enzymes in saliva and intestinal secretions that break (a1n4) glycosidic residues and the hydrogen-bond cross-links. In the bonds between glucose units. Most animals cannot use cellulose as a fuel hexose unit at the lower left, all hydrogen atoms are source because they lack an enzyme to hydrolyze the (b1n4) linkages. Ter- shown; in the other three hexose units, all hydrogens mites readily digest cellulose (and therefore wood), but only because their attached to carbon have been omitted for clarity as they do not participate in hydrogen bonding. intestinal tract harbors a symbiotic microorganism, Trichonympha, that secretes cellulase, an enzyme that hydrolyzes (b1n4) linkages between glucose units. Wood-rot fungi and bacteria also produce cellulase. The only vertebrates able to use cellulose as food are cattle and other ruminants (sheep, goats, camels, giraffes). The extra stomach compartment (rumen) of a ruminant teems with bacteria and protists that secrete cellulase. Chitin is a linear homopolysaccharide composed of N-acetylglucosamine residues in b linkage (Fig. 9–18). The only chemical difference from cellu- lose is the replacement of the hydroxyl group at C-2 with an acetylated amino group. Chitin forms extended fibers similar to those of cellulose, and like cellulose is indigestible by vertebrate animals. Chitin is the principal component of the hard exoskeletons of nearly a million species of arthro- pods—insects, lobsters, and crabs, for example—and is probably the sec- figure 9–18 ond most abundant polysaccharide, next to cellulose, in nature. A short segment of chitin, a homopolymer of N-acetyl-D- glucosamine units in (b1n4) linkage.

CH3 CH3 A A CPO CPO A A 6 CH2OH H NH CH2OH HNH O 3 2 O 5 O O H H OH H H H H OH H H 4 1 41 OH H H OH H H H H H H O O O O 32 5 H NH CH2OH H NH CH2OH A 6 A CPO CPO A A CH3 CH3

Bacterial Cell Walls Contain Peptidoglycans The rigid component of bacterial cell walls is a heteropolymer of alternating (b1n4)-linked N-acetylglucosamine and N-acetylmuramic acid residues (Fig. 9–19). The linear polymers lie side by side in the cell wall, cross-linked 308 Part II Structure and Catalysis figure 9–19 Peptidoglycan. This is the peptidoglycan of the cell wall of Staphylococcus aureus, a gram-positive bacterium. Peptides (strings of colored spheres) covalently link Staphylococcus N-acetylmuramic acid residues in neighboring polysac- aureus charide chains. Note the mixture of L and D amino acids in the peptides. Gram-positive bacteria such as S. aureus have a pentaglycine chain in the cross-link. Gram- negative bacteria such as E. coli lack the pentaglycine; instead, the terminal D-Ala residue of one tetrapeptide is attached directly to a neighboring tetrapeptide through either L-Lys or a lysine-like amino acid, diaminopimelic acid.

N-Acetylmuramic acid (Mur2Ac)

(b14) N-Acetylglucosamine Site of cleavage (GlcNAc) by lysozyme

Reducing end

L-Ala D-Glu L-Lys D-Ala Pentaglycine cross-link

by short peptides, the exact structure of which depends on the bacterial species. The peptide cross-links weld the polysaccharide chains into a strong sheath that envelops the entire cell and prevents cellular swelling and lysis due to the osmotic entry of water. The enzyme lysozyme kills bac- teria by hydrolyzing the (b1n4) glycosidic bond between N-acetylglu- cosamine and N-acetylmuramic acid. Lysozyme is notably present in tears, presumably as a defense against bacterial infections of the eye. It is also produced by certain bacterial viruses to ensure their release from the host bacterial cell, an essential step of the viral infection cycle. Penicillin and re- lated antibiotics kill bacteria by preventing the synthesis of cross-links, leaving the cell wall too weak to resist osmotic lysis (see Box 20–1).

Glycosaminoglycans Are Components of the Extracellular Matrix The extracellular space in the tissues of multicellular animals is filled with a gel-like material, the extracellular matrix, also called ground sub- stance, which holds the cells together and provides a porous pathway for the diffusion of nutrients and oxygen to individual cells. The extracellular matrix is composed of an interlocking meshwork of heteropolysaccharides and fibrous proteins such as collagen, elastin, fibronectin, and laminin. The heteropolysaccharides, called , are a family of linear polymers composed of repeating disaccharide units (Fig. 9–20). One of the two monosaccharides is always either N-acetylglucosamine or N-acetyl- Chapter 9 Carbohydrates and Glycobiology 309

galactosamine; the other is in most cases a uronic acid, usually D-glucuronic or L-iduronic acid. In some glycosaminoglycans, one or more of the hydrox- yls of the is esterified with sulfate. The combination of sulfate groups and the carboxylate groups of the uronic acid residues gives gly- cosaminoglycans a very high density of negative charge. To minimize the re- pulsive forces among neighboring charged groups, these molecules assume an extended conformation in solution. The specific patterns of sulfated and nonsulfated sugar residues in glycosaminoglycans provide for specific recognition by a variety of protein ligands that bind electrostatically to these molecules. Glycosaminoglycans are attached to extracellular proteins to form proteoglycans (discussed below). The hyaluronic acid (hyaluronate at physiological pH) contains alternating residues of D-glucuronic acid and N-acetylglu- cosamine (Fig. 9–20). With up to 50,000 repeats of the basic disaccharide unit, hyaluronates have molecular weights greater than 1 million; they form clear, highly viscous solutions that serve as lubricants in the synovial fluid of joints and give the vitreous humor of the vertebrate eye its jellylike con- sistency (the Greek hyalos means “glass”; hyaluronates can have a glassy

Glycosaminoglycan Repeating disaccharide Number of disaccharides per chain

CH2OH O H H O COO H O HO H O Hyaluronate H H ( 1n4) 50,000 H NH OH H A H CPO (1n3) A H OH CH3 GlcA GlcNAc

CH2OH O O SO 3 H O COO H O H H 20–60 Chondroitin O H H ( 1n4) 4-sulfate H NH OH H A H CPO (1n3) A H OH CH3 GlcA GalNAc4SO3

CH2OSO3 O CH2OH O H H O 25 Keratan HO H OH H sulfate O H H (1n3) figure 9–20 H H H NH Repeating units of some common glycosaminoglycans (1n4) A of extracellular matrix. The molecules are copolymers of H OH CPO alternating uronic acid and amino sugar residues, with A sulfate esters in any of several positions. The ionized car- CH 3 boxylate and sulfate groups (red) give these polymers Gal GlcNAc6SO 3 their characteristic highly negative charge. 310 Part II Structure and Catalysis

or translucent appearance). Hyaluronate is also an essential component of the extracellular matrix of cartilage and tendons, to which it contributes tensile strength and elasticity as a result of strong interactions with other components of the matrix. Hyaluronidase, an enzyme secreted by some pathogenic (disease-causing) bacteria, can hydrolyze the glycosidic link- ages of hyaluronate, rendering tissues more susceptible to bacterial inva- sion. In many organisms, a similar enzyme in sperm hydrolyzes an outer gly- cosaminoglycan coat around the ovum, allowing sperm penetration. Other glycosaminoglycans differ from hyaluronate in two respects: they are generally much shorter polymers and they are covalently linked to spe- cific proteins (proteoglycans, discussed below). Chondroitin sulfate (Greek chondros, “cartilage”) contributes to the tensile strength of cartilage, ten- dons, ligaments, and the walls of the aorta. Dermatan sulfate (Greek derma, “skin”) contributes to the pliability of skin and is also present in blood vessels and heart valves. Keratan sulfates (Greek keras, “horn”) have no uronic acid and their sulfate content is variable. They are present in cornea, cartilage, bone, and a variety of horny structures formed of dead cells: horn, hair, hoofs, nails, and claws. Heparin (Greek he¯par, “liver”) is a natural anticoagulant made in mast cells and released into the blood, where it inhibits coagulation by binding to and stimulating the anticoagulant pro- tein antithrombin III. The interaction is strongly electrostatic; heparin has the highest negative charge density of any known biological macromolecule. Purified heparin is routinely added to blood samples obtained for clinical analysis, and to blood donated for transfusion, to prevent clotting. Table 9–2 summarizes the composition, properties, roles, and occur- rence of the polysaccharides described in this section.

table 9–2 Structures and Roles of Some Polysaccharides Size (number of Polymer Type* Repeating unit† monosaccharide units) Roles Starch Energy storage: in plants Amylose Homo- (a1n4)Glc, linear 50–5,000 Amylopectin Homo- (a1n4)Glc, with Up to 106 (a1n6)Glc branches every 24 to 30 residues Glycogen Homo- (a1n4)Glc, with Up to 50,000 Energy storage: in bacteria and animal cells (a1n6)Glc branches every 8 to 12 residues Cellulose Homo- (b1n4)Glc Up to 15,000 Structural: in plants, gives rigidity and strength to cell walls Chitin Homo- (b1n4)GlcNAc Very large Structural: in insects, spiders, crustaceans, gives rigidity and strength to exoskeletons Peptidoglycan Hetero-; 4)Mur2Ac(b1n4) Very large Structural: in bacteria, gives rigidity and strength peptides GlcNAc(b1 to cell envelope attached Hyaluronate (a Hetero-; 4)GlcA(b1n3) Up to 100,000 Structural: in vertebrates, extracellular matrix of glycosamino- acidic GlcNAc(b1 skin and connective tissue; viscosity and ) lubrication in joints

* Each polymer is classified as a homopolysaccharide (homo-) or heteropolysaccharide (hetero-). †The abbreviated names for the peptidoglycan and hyaluronate repeating units indicate that the polymer contains repeats of this disaccharide unit, with the GlcNAc of one disaccharide unit linked b(1n4) to the first residue of the next disaccharide unit. Chapter 9 Carbohydrates and Glycobiology 311

Glycoconjugates: Proteoglycans, Glycoproteins, and Glycolipids In addition to their important roles as stored fuels (starch, glycogen) and as structural materials (cellulose, chitin, peptidoglycan), polysaccharides (and oligosaccharides) are information carriers: they serve as destination labels for some proteins and as mediators of specific cell-cell interactions and in- teractions between cells and the extracellular matrix. Specific carbohy- drate-containing molecules act in cell-cell recognition and adhesion, cell mi- gration during development, blood clotting, the immune response, and wound healing, to name but a few of their many roles. In most of these cases, the informational carbohydrate is covalently joined to a protein or a lipid to form a glycoconjugate, which is the biologically active molecule. Proteoglycans are macromolecules of the cell surface or extracellular matrix in which one or more glycosaminoglycan chains are joined covalently to a membrane protein or a secreted protein. The glycosaminoglycan moi- ety commonly forms the greater fraction (by mass) of the proteoglycan molecule, dominates the structure, and is often the main site of biological activity. In many cases the biological activity is the provision of multiple binding sites, rich in opportunities for hydrogen bonding and electrostatic interactions with other proteins of the cell surface or the extracellular ma- trix. Proteoglycans are major components of connective tissue such as car- tilage, in which their many noncovalent interactions with other proteogly- cans, proteins, and glycosaminoglycans provide strength and resilience. Glycoproteins have one or several oligosaccharides of varying com- plexity joined covalently to a protein. They are found on the outer face of the plasma membrane, in the extracellular matrix, and in the blood. Inside cells they are found in specific organelles such as Golgi complexes, secre- tory granules, and lysosomes. The oligosaccharide portions of glycoproteins are less monotonous than the glycosaminoglycan chains of proteoglycans; they are rich in information, forming highly specific sites for recognition and high-affinity binding by other proteins. Glycolipids are membrane lipids in which the hydrophilic head groups are oligosaccharides, which, as in glycoproteins, act as specific sites for recognition by carbohydrate-binding proteins. Glycobiology, the study of the structure and function of glycoconju- gates, is one of the most active and exciting areas of biochemistry and cell biology. Our discussion uses just a few examples to illustrate the diversity of structure and the range of biological activity of the glycoconjugates. In Chapter 20 we discuss the biosynthesis of polysaccharides, including the glycosaminoglycans, and in Chapter 27, the assembly of oligosaccharide chains on glycoproteins.

Proteoglycans Are Glycosaminoglycan-Containing Macromolecules of the Cell Surface and Extracellular Matrix 1 3 1 4 1 3 1 3 1 4 The basic proteoglycan unit consists of a “core protein” with covalently at- GlcA GalNAc GlcA Gal Gal Xyl Ser tached glycosaminoglycan(s). For example, the sheetlike extracellular ma- n Chondroitin sulfate trix (basal lamina) that separates organized groups of cells contains a fam- Core protein ily of core proteins (Mr 20,000 to 40,000), each with several covalently attached heparan sulfate chains. (Heparan sulfate is structurally similar to heparin but has a lower density of sulfate esters.) The point of attachment figure 9–21 is commonly a Ser residue, to which the glycosaminoglycan is joined Proteoglycan structure, showing the through a trisaccharide bridge (Fig. 9–21). The Ser residue is generally in bridge. A typical trisaccharide linker (blue) connects a glycosaminoglycan—in this case chondroitin sulfate the sequence –Ser–Gly–X–Gly– (where X is any amino acid residue), al- (orange)—to a Ser residue (red) in the core protein. The though not every protein with this sequence has an attached glycosamino- xylose residue at the reducing end of the linker is joined glycan. Many proteoglycans are secreted into the extracellular matrix, but by its anomeric carbon to the hydroxyl of the Ser residue. 312 Part II Structure and Catalysis

+ Heparan NH3 some are integral membrane proteins. For example, syndecan core protein sulfate (Mr 56,000) has a single transmembrane domain and an extracellular do- main bearing three chains of heparan sulfate and two of chondroitin sulfate, each attached to a Ser residue (Fig. 9–22). The heparan sulfate moieties Chondroitin bind a variety of extracellular ligands and thereby modulate the ligands’ in- sulfate teraction with specific receptors of the cell surface. Some proteoglycans can form proteoglycan aggregates, enormous supramolecular assemblies of many core proteins all bound to a single mol-

ecule of hyaluronate. Aggrecan core protein (Mr 250,000) has multiple chains of chondroitin sulfate and keratan sulfate, joined to Ser residues in Outside the core protein through trisaccharide linkers, to give an aggrecan 6 monomer of Mr 2 10 . When a hundred or more of these “decorated” core proteins bind a single, extended molecule of hyaluronate (Fig. 9–23), 8 the resulting proteoglycan aggregate (Mr 2 10 ) and its associated wa- ter of hydration occupy a volume about equal to that of a bacterial cell! Ag- grecan interacts strongly with collagen in the extracellular matrix of carti- lage, contributing to its development and tensile strength. Inside Interwoven with these enormous extracellular proteoglycans are fi- brous matrix proteins such as collagen, elastin, and fibronectin, forming a –OOC cross-linked meshwork that gives the whole extracellular matrix strength and resilience. Some of these proteins are multiadhesive, a single protein figure 9–22 having binding sites for several different matrix molecules. Fibronectin, for Proteoglycan structure of an integral membrane protein. This schematic diagram shows syndecan, a core protein example, has separate domains that bind fibrin, heparan sulfate, collagen, of the plasma membrane. The amino-terminal domain and a family of plasma membrane proteins called integrins that mediate sig- on the extracellular side of the membrane is covalently naling between the cell interior and the extracellular matrix (see Fig. 12– attached (by trisaccharide linkers such as those in 19). Integrins, in turn, have binding sites for a number of other extracellu- Fig. 9–21) to three heparan sulfate chains and two chon- lar macromolecules. The picture of cell-matrix interactions that emerges droitin sulfate chains. (Fig. 9–24) shows an array of interactions between cellular and extracellu- lar molecules. These interactions serve not merely to anchor cells to the ex- tracellular matrix but also to provide paths that direct the migration of cells

Hyaluronate (up to 50,000 repeating disaccharides) figure 9–23 A proteoglycan aggregate of the extracellular matrix. Keratan One very long molecule of hyaluronate is associated non- sulfate covalently with about 100 molecules of the core protein Chondroitin aggrecan. Each aggrecan molecule contains many cova- sulfate lently bound chondroitin sulfate and keratan sulfate Link proteins chains. Link proteins situated at the junction between each core protein and the hyaluronate backbone mediate Aggrecan the core protein–hyaluronate interaction. core protein Chapter 9 Carbohydrates and Glycobiology 313 in developing tissue and, through integrins, to convey information in both Actin filaments directions across the plasma membrane. Matrix proteoglycans are essential in the response of cells to certain ex- Integrin tracellular growth factors. For example, fibroblast growth factor (FGF), an extracellular protein signal that stimulates cell division, first binds to he- paran sulfate moieties of syndecan molecules in the target cell’s plasma membrane. Syndecan then “presents” FGF to the specific FGF plasma membrane receptor, and only then can FGF interact productively with its Proteoglycan receptor to trigger cell division.

Glycoproteins Are Information-Rich Conjugates Containing Oligosaccharides Fibronectin Glycoproteins are carbohydrate-protein conjugates in which the carbohy- drate moieties are smaller and more structurally diverse than the gly- cosaminoglycans of proteoglycans. The carbohydrate is attached at its anomeric carbon through a glycosidic link to the XOH of a Ser or Thr residue (O-linked), or through an N-glycosyl link to the amide nitrogen of an Asn residue (N-linked) (Fig. 9–25). Some glycoproteins have a single Cross-linked oligosaccharide chain, but many have more than one; the carbohydrate may fibers of collagen constitute from 1% to 70% or more of the by mass. The struc- tures of a large number of O- and N-linked oligosaccharides from a variety of glycoproteins are known; Figure 9–25 shows a few typical examples. As we will see in Chapter 12, the external surface of the plasma mem- Plasma membrane brane has many membrane glycoproteins with arrays of covalently attached figure 9–24 oligosaccharides of varying complexity. One of the best-characterized mem- Interactions between cells and extracellular matrix. The brane glycoproteins is glycophorin A of the erythrocyte membrane (see Fig. association between cells and the proteoglycan of extra- 12–10). It contains 60% carbohydrate by mass in the form of 16 oligosac- cellular matrix is mediated by a membrane protein (inte- charide chains (totalling 60 to 70 monosaccharide residues) covalently at- grin) and by an extracellular protein (fibronectin in this tached to amino acid residues near the amino terminus of the polypeptide example) with binding sites for both integrin and the pro- teoglycan. Note the close association of collagen fibers chain. Fifteen of the oligosaccharide chains are O-linked to Ser or Thr with the fibronectin and proteoglycan. residues, and one is N-linked to an Asn residue.

(a) O-linked (b) N-linked

O

CH2 HOCH2 O C O O O HO H H NH C CH CH H C O H 2 OH H OH H NH H O CH2 CH O H NH H NH H NH C O C O figure 9–25 Oligosaccharide linkages in glycoproteins. (a) O-linked CH3 CH3 GalNAcSer GlcNAc Asn oligosaccharides have a glycosidic bond to the hydroxyl group of Ser or Thr residues (shaded pink), illustrated Examples: Examples: here with GalNAc as the sugar at the reducing end of the oligosaccharide. One simple chain and one complex chain are shown. (Recall that Neu5Ac—N-acetylneuraminic Glc acid—is commonly called sialic acid.) (b) N-linked

Ser/Thr GlcNAc Asn Asn oligosaccharides have an N-glycosyl bond to the amide Man nitrogen of an Asn residue (shaded green), illustrated here Gal with GlcNAc as the terminal sugar. Three common types Neu5Ac of oligosaccharide chains that are N-linked in glycopro- Fuc teins are shown. A complete description of oligosaccharide GalNAc Asn structure requires specification of the position and stereo- Ser/Thr chemistry (a or b) of each glycosidic linkage. 314 Part II Structure and Catalysis

Many of the proteins secreted by eukaryotic cells are glycoproteins, including most of the proteins of blood. For example, immunoglobulins (antibodies) and certain hormones, such as follicle-stimulating hormone, luteinizing hormone, and thyroid-stimulating hormone, are glycoproteins. Many milk proteins, including lactalbumin, and some of the proteins se- creted by the pancreas (e.g., ribonuclease) are glycosylated, as are most of the proteins contained in lysosomes. The biological advantages of adding oligosaccharides to proteins are not fully understood. The very hydrophilic clusters of carbohydrate alter the polarity and solubility of the proteins with which they are conjugated. Oligosaccharide chains attached to newly synthesized proteins in the Golgi complex may also influence the sequence of polypeptide-folding events that determine the tertiary structure of the protein (see Fig. 27–36); steric in- teractions between peptide and oligosaccharide may preclude one folding route and favor another. When numerous negatively charged oligosaccha- ride chains are clustered in a single region of a protein, the charge repulsion among them favors the formation of an extended, rodlike structure in that region. The bulkiness and negative charge of oligosaccharide chains also protect some proteins from attack by proteolytic enzymes. Beyond these global physical effects on protein structure, there are more specific biological effects of oligosaccharide chains in glycoproteins. We noted earlier the difference between the information-rich linear se- quences of nucleic acids and proteins and the monotonous regularity of ho- mopolysaccharides such as cellulose (see Fig. 3–23). The oligosaccharides attached to glycoproteins are generally not monotonous but enormously rich in structural information. Consider the oligosaccharide chains in Figure 9–25, typical of those found in many glycoproteins. The most complex of those shown contains 14 monosaccharide residues of four different kinds, variously linked as (1n2), (1n3), (1n4), (1n6), (2n3), and (2n6), some with the a and some with the b configuration. The number of possible per- mutations and combinations of monosaccharide types and glycosidic link- ages in an oligosaccharide of this size is astronomical. Each of the oligosac- charides in Figure 9–25 therefore presents a unique face, recognizable by the enzymes and receptors that interact with it. A number of cases are known in which the same protein produced in two tissues has different gly- cosylation patterns. The human protein interferon IFN-b1 produced in ovarian cells, for example, has oligosaccharide chains that differ from those of the same protein produced in breast epithelial cells. The biological sig- nificance of these tissue glycoforms is not understood, but in some way the oligosaccharide chains represent a tissue-specific mark.

Glycolipids and Lipopolysaccharides Are Membrane Components Glycoproteins are not the only cellular components that bear complex oligosaccharide chains; some lipids, too, have covalently bound oligosac- charides. Gangliosides are membrane lipids of eukaryotic cells in which the polar head group, the part of the lipid that forms the outer surface of the membrane, is a complex oligosaccharide containing sialic acid (Fig. 9– 9) and other monosaccharide residues. Some of the oligosaccharide moi- eties of gangliosides, such as those that determine human blood groups (see Fig. 11–12), are identical with those found in certain glycoproteins, which therefore also contribute to blood group type determination. Like the oligosaccharide moieties of glycoproteins, those of membrane lipids are generally, perhaps always, found on the outer face of the plasma membrane. Lipopolysaccharides are the dominant surface feature of the outer membrane of gram-negative bacteria such as E. coli and Salmonella ty- phimurium. They are prime targets of the antibodies produced by the ver- Chapter 9 Carbohydrates and Glycobiology 315

n10 O-Specific chain GlcNAc Man Glc Gal AbeOAc Rha Kdo Core Hep

O O O O O O O O HO O P O O P HO NH HN OH O O O O O

O O OH OH O O (b) Lipid A figure 9–26 Bacterial lipopolysaccharides. (a) Schematic diagram of the lipopolysaccharide of the outer membrane of Salmo- nella typhimurium. Kdo is 3-deoxy-D-manno-octonic acid, also called ketodeoxyoctulosonic acid; Hep is L-glycero- D-mannoheptose; AbeOAc is abequose (a 3,6- dideoxyhexose) acetylated on one of its hydroxyls. There are six fatty acids in the lipid A portion of the molecule. Different bacterial species have subtly different lipopolysaccharide structures, but they have in common a lipid region (lipid A), a core oligosaccharide, and an “O-specific” chain, which is the principal determinant of (a) the serotype (immunological reactivity) of the bacterium. The outer membrane of the gram-negative bacteria tebrate immune system in response to bacterial infection and are therefore S. typhimurium and E. coli contains so many lipopolysac- important determinants of the serotype of bacterial strains (serotypes are charide molecules that the cell surface is virtually covered with O-specific chains. (b) Space-filling molecular model strains that are distinguished on the basis of antigenic properties). The of the lipopolysaccharide from E. coli. lipopolysaccharides of S. typhimurium contain six fatty acids bound to two glucosamine residues, one of which is the point of attachment for a complex oligosaccharide (Fig. 9–26). E. coli has similar but unique lipopolysaccharides. The lipopolysaccharides of some bacteria are toxic to humans and other animals; for example, they are responsible for the dan- gerously lowered blood pressure that occurs in toxic shock syndrome re- sulting from gram-negative bacterial infections.

Oligosaccharide-Lectin Interactions Mediate Many Biological Processes Lectins, found in all organisms, are proteins that bind carbohydrates with high affinity and specificity. A number of critically situated hydrogen-bonding partners in the carbohydrate recognition domain of each type of protein bind to specific oligosaccharides, and thus lectins easily distinguish be- tween closely similar sugars (Table 9–3). Lectins serve in a wide variety of cell-cell recognition and adhesion processes. Here we discuss just a few ex- amples of the roles of lectins in nature. In the laboratory, purified lectins are useful reagents for detecting and separating glycoproteins with different oligosaccharide moieties. 316 Part II Structure and Catalysis

table 9–3 Lectins and the Oligosaccharide Ligands That They Bind Lectin family and lectin Abbreviation Ligand(s) Plant

Concanavalin A ConA Mana1XOCH3 Griffonia simplicifolia GS4 Lewis b (Leb) lectin 4 Wheat germ agglutinin WGA Neu5Ac(a2n3)Gal(b1n4)Glc GlcNAc(b1n4)GlcNAc Ricin Gal(b1n4)Glc Animal Galectin-1 Gal(b1n4)Glc Mannose-binding protein A MBP-A High-mannose octasaccharide Viral Influenza virus hemagglutinin HA Neu5Ac(a2n6)Gal(b1n4)Glc Polyoma virus protein 1 VP1 Neu5Ac(a2n3)Gal(b1n4)Glc Bacterial Enterotoxin LT Gal Cholera toxin CT GM1 pentasaccharide

Source: Weiss, W.I. & Drickamer, K. (1996) Structural basis of lectin-carbohydrate recognition. Annu. Rev. Biochem. 65, 441–473.

The sialic acid (Neu5Ac) residues situated at the ends of the oligosac- charide chains of many plasma glycoproteins (Fig. 9–25) protect the pro- teins from uptake and degradation in the liver. For example, ceruloplasmin, a copper-transporting glycoprotein, has several oligosaccharide chains end- ing in sialic acid. Removal of sialic acid residues by the enzyme sialidase is one way in which the body marks “old” proteins for destruction and re- placement. The plasma membrane of hepatocytes has lectin molecules (asialoglycoprotein receptors; “asialo-” indicating “without sialic acid”) that specifically bind oligosaccharide chains with galactose residues no longer “protected” by a terminal sialic acid residue. Receptor-ceruloplasmin inter- action triggers endocytosis and destruction of the ceruloplasmin. A similar mechanism is apparently responsible for removing old ery- throcytes from the mammalian bloodstream. Newly synthesized erythro- cytes have several membrane glycoproteins with oligosaccharide chains that end in sialic acid. When sialic acid residues are removed by withdraw- ing a sample of blood, treating it with sialidase in vitro, and reintroducing it into the circulation, the treated erythrocytes disappear from the blood- stream within a few hours, whereas those with intact oligosaccharides (ery- throcytes withdrawn and reintroduced without sialidase treatment) con- tinue to circulate for days. Some peptide hormones that circulate in the blood have oligosaccha- ride moieties that strongly influence their circulatory lifetime. Luteinizing hormone and thyrotropin (polypeptide hormones produced in the adrenal cortex) have N-linked oligosaccharides that end with the disaccharide GalNAc4SO3 (b1n4)GlcNAc, which is recognized by a lectin (receptor) of hepatocytes. (GalNAc4SO3 is N-acetylgalactosamine sulfated on the XOH group of C-4.) Receptor-hormone interaction mediates the uptake and de- struction of luteinizing hormone and thyrotropin, reducing their concentra- Glycoprotein ligand Glycoprotein ligand for integrin for P-selectin tion in the blood. Thus the blood levels of these hormones undergo a peri- Integrin odic rise (due to secretion by the adrenal cortex) and fall (due to destruc- P-selectin tion by hepatocytes). Free Selectins are a family of lectins, found in plasma membranes, that me- Capillary T lymphocyte diate cell-cell recognition and adhesion in a wide range of cellular endothelial processes. One such process is the movement of immune cells (T lympho- cell cytes) through the capillary wall, from blood to tissues, at sites of infection Rolling or inflammation (Fig. 9–27). At an infection site, P-selectin on the surface of capillary endothelial cells interacts with a specific oligosaccharide of the glycoproteins of circulating T lymphocytes. This interaction slows the T cells as they adhere to and roll along the endothelial lining of the capillar- Adhesion ies. A second interaction, between integrin molecules in the T-cell plasma membrane and an adhesion protein on the endothelial cell surface, now stops the T cell and allows it to move through the capillary wall into the in- Site of fected tissues to initiate the immune attack. Two other selectins participate inflammation in this process: E-selectin on the endothelial cell and L-selectin on the T cell bind their cognate oligosaccharides on the other cell. Extravasation Some microbial pathogens have lectins that mediate bacterial adhesion to host cells or toxin entry into cells. The bacterium believed responsible for Blood most gastric ulcers, Helicobacter pylori, adheres to the inner surface of the flow stomach by interactions between bacterial membrane lectins and specific oligosaccharides of membrane glycoproteins of the gastric epithelial cells (Fig. 9–28). Among the binding sites recognized by H. pylori is the figure 9–27 b oligosaccharide Le when it is a part of the type O blood group determinant. Role of lectin-ligand interactions in lymphocyte move- This observation helps to explain the severalfold greater incidence of gas- ment to the site of an infection or injury. A T lympho- tric ulcers in people of blood type O than those of type A or B. Chemically cyte circulating through a capillary is slowed by transient synthesized analogs of the Leb oligosaccharide may prove useful in treating interactions between P-selectin molecules in the plasma membrane of capillary endothelial cells and glycoprotein this type of ulcer. Administered orally, they could prevent bacterial adhe- ligands for P-selectin on the T-cell surface. As it interacts sion (and thus infection) by competing with the gastric glycoproteins for with successive P-selectin molecules, the T cell rolls binding to the bacterial lectin. along the capillary surface. Near a site of inflammation, The cholera toxin molecule (produced by Vibrio cholerae) triggers di- stronger interactions between integrin in the capillary arrhea after entering intestinal cells responsible for water absorption from surface and its ligand in the T-cell surface lead to tight adhesion. The T-cell stops rolling and, under the influ- the intestine. It attaches to its target cell through the oligosaccharide of ence of signals sent out from the site of the inflammation, ganglioside GM1, a membrane phospholipid (see Box 11–2, Fig. 1), in the begins extravasation—escape through the capillary wall— surface of intestinal epithelial cells. Similarly, the pertussis toxin produced as it moves toward the site of inflammation. by Bordetella pertussis, the bacterium that causes whooping cough, enters target cells only after interacting with an oligosaccharide (or perhaps sev- eral oligosaccharides) with a terminal sialic acid residue. Understanding the details of the oligosaccharide-binding sites of these toxins (lectins) may al- low the development of genetically engineered toxin analogs for use in vac- cines. Toxin analogs engineered to lack the carbohydrate binding site would be harmless because they could not bind to and enter cells, but they might elicit an immune response that would protect the recipient if exposed to the natural toxin. Several animal viruses, including the influenza virus, attach to their host cells through interactions with oligosaccharides displayed on the host cell surface. The lectin of the influenza virus is the HA protein, which we will describe in Chapter 12. The HA protein is essential for viral entry and infection (see pp. 405–406 and Fig. 12–21a).

figure 9–28 Helicobacter pylori cells adhering to the gastric surface. This bacterium causes ulcers by interactions between a bacterial surface lectin and the Leb oligosaccharide (a blood group antigen) of the gastric epithelium.

317 318 Part II Structure and Catalysis

Virus Bacterium Lymphocyte

Oligosaccharide chain

Plasma membrane protein Toxin

P-selectin

Glycolipid

(c)

(d) (b) (e) (a) figure 9–29 Roles of oligosaccharides in recognition and adhesion Figure 9–29 summarizes the types of biological interactions that are at the cell surface. (a) Oligosaccharides with unique mediated by carbohydrate binding. structures (represented as strings of colored balls), com- ponents of a variety of glycoproteins or glycolipids on the outer surface of plasma membranes, interact with high specificity and affinity with lectins in the extracellular Analysis of Carbohydrates milieu. (b) Viruses that infect animal cells, such as the The growing appreciation of the importance of oligosaccharide structure influenza virus, bind to glycoproteins on the cell as the in biological recognition has been the driving force behind the development first step of infection. (c) Bacterial toxins, such as the cholera and pertussis toxins, bind to a surface glycolipid of methods for analyzing the structure and stereochemistry of complex before entering a cell. (d) Some bacteria, such as Heli- oligosaccharides. Oligosaccharide analysis is complicated by the fact that, cobacter pylori, adhere to then colonize or infect animal unlike nucleic acids and proteins, oligosaccharides can be branched and are cells. (e) Lectins called selectins in the plasma mem- joined by a variety of linkages. For such analysis, oligosaccharides are gen- brane of certain cells mediate cell-cell interactions, such erally removed from their protein or lipid conjugates, then subjected to as those of T lymphocytes with the endothelial cells of the capillary wall at an infection site. stepwise degradation with specific reagents that reveal bond position or stereochemistry. Mass spectrometry and NMR spectroscopy have also be- come invaluable in deciphering oligosaccharide structure. The oligosaccharide moieties of glycoproteins or glycolipids can be re- leased by purified enzymes—glycosidases that specifically cleave O- or N- linked oligosaccharides or lipases that remove lipid head groups. Mixtures of carbohydrates are resolved into their individual components (Fig. 9–30) by some of the same techniques useful in protein and amino acid separa- tion: fractional precipitation by solvents, and ion-exchange and gel filtration (size-exclusion) chromatography (see Fig. 5–18). Highly purified lectins, attached covalently to an insoluble support, are commonly used in affinity chromatography of carbohydrates (see Fig. 5–18c). Hydrolysis of oligosac- charides and polysaccharides in strong acid yields a mixture of monosac- charides, which, after conversion to suitable volatile derivatives, may be separated, identified, and quantified by gas-liquid chromatography (see p. 385) to yield the overall composition of the polymer. For simple, linear poly- mers such as amylose, the positions of the glycosidic bonds are determined Chapter 9 Carbohydrates and Glycobiology 319

Glycoprotein or glycolipid

Release oligosaccharides with endoglycosidase

Oligosaccharide mixture

1) Ion-exchange chromatography 2) Gel filtration 3) Lectin affinity chromatography

Purified Separated polysaccharide oligosaccharides

Exhaustive methylation Enzymatic hydrolysis NMR and Hydrolysis with with CH3I, with specific mass strong acid strong base glycosidases spectrometry

Monosaccharides Fully methylated Smaller carbohydrate oligosaccharides

Resolution of fragments High-performance Acid hydrolysis yields in mixture liquid chromatography or monosaccharides derivatization methylated at every Each oligosaccharide and gas-liquid —OH except those involved subjected to methylation chromatography in glycosidic bonds or enzymatic analysis

Composition of Sequence of Sequence of mixture Position(s) of monosaccharides; monosaccharides; Types and amounts glycosidic position and position and of monosaccharide bonds configuration of configuration of units glycosidic bonds glycosidic bonds

figure 9–30 Methods of carbohydrate analysis. A carbohydrate puri- by treating the intact polysaccharide with methyl iodide in a strongly basic fied in the first stage of the analysis often requires all four medium to convert all free hydroxyls to acid-stable methyl ethers, then hy- analytical routes for its complete characterization. drolyzing the methylated polysaccharide in acid. The only free hydroxyls present in the monosaccharide derivatives produced are those that were in- volved in glycosidic bonds. To determine the sequence of monosaccharide residues, including branches if they are present, exoglycosidases of known specificity are used to remove residues one at a time from the nonreducing end(s). The specificity of these exoglycosidases often allows deduction of the position and stereochemistry of the linkages. Oligosaccharide analysis relies increasingly on mass spectrometry and high-resolution NMR spectroscopy. NMR analysis alone, especially for oligo- saccharides of moderate size, can yield much information about sequence, linkage position, and anomeric carbon configuration. Polysaccharides and large oligosaccharides can be treated chemically or with endoglycosidases 320 Part II Structure and Catalysis

to split specific internal glycosidic bonds, producing several smaller, more easily analyzable oligo-saccharides. Automated procedures and commercial instruments are used for the routine determination of oligosaccharide structure, but the sequencing of branched oligosaccharides joined by more than one type of bond remains a far more formidable problem than deter- mining the linear sequences of proteins and nucleic acids, which have monomers joined by a single bond type.

summary

Carbohydrates are predominantly cyclized poly- amino sugar and uronic acid units. These include hydroxy aldehydes or ketones, which occur in na- hyaluronate, a high molecular weight polymer ture as monosaccharides (aldoses or ketoses), of alternating D-glucuronic acid and N-acetyl-D- oligosaccharides (several monosaccharide units), glucosamine residues, and a variety of shorter, and polysaccharides (large linear or branched sulfated, very acidic heteropolysaccharides such molecules containing many monosaccharide as heparan sulfate, chondroitin sulfate, keratan units). Monosaccharides have at least one asym- sulfate, and dermatan sulfate. metric carbon atom and thus exist in stereoiso- Many of the oligosaccharides in a cell are in meric forms. Most common, naturally occurring glycoconjugates—hybrid molecules in which the sugars, such as ribose, glucose, fructose, and carbohydrate moiety is covalently bound to a mannose, are of the D series. Simple sugars hav- protein or lipid. Proteoglycans, proteins (such as ing four or more carbon atoms may exist in the syndecan and aggrecan core proteins) with one form of closed-ring hemiacetals or hemiketals, ei- or more covalently attached glycosaminoglycan ther furanoses (five-membered ring) or pyra- chains, are generally found on the outer surface noses (six-membered ring). Furanoses and pyra- of cells or in the extracellular matrix. In most tis- noses occur in anomeric a and b forms, which are sues, the extracellular matrix contains a variety interconverted in the process of mutarotation. of proteoglycans and multiadhesive proteins Sugars with free, oxidizable anomeric carbons are such as fibronectin, as well as huge proteoglycan called reducing sugars. Many derivatives of the aggregates consisting of many proteoglycans non- simple sugars are found in living cells, including covalently bound to a single large hyaluronate amino sugars and their acetylated derivatives, al- molecule. donic acids, and uronic acids. The hydroxyl Glycoproteins contain covalently linked groups of monosaccharides can also form phos- oligosaccharides that are smaller but more struc- phate and sulfate esters. Disaccharides consist of turally complex, and therefore more informa- two monosaccharides joined by a glycosidic bond. tion-rich, than glycosaminoglycans. Many cell Polysaccharides (glycans) contain many surface or extracellular proteins are glycopro- monosaccharide residues in glycosidic linkage. teins, as are most secreted proteins. The carbo- Some (starch and glycogen) function as storage hydrate moieties serve as biological labels, which forms of carbohydrate, high molecular weight, are “read” by lectins, proteins that bind oligosac- branched polymers of glucose having (a1n4) charide chains with high affinity and selectivity. linkages in the main chains and (a1n6) linkages Interactions between lectins and specific at the branch points. Other polysaccharides play oligosaccharides are central to cell-cell recogni- a structural role in cell walls and exoskeletons. tion and adhesion. In vertebrates, oligosaccha- Cellulose (in plants) has D-glucose units in ride tags govern the rate of degradation of cer- (b1n4) linkage; chitin (in insect exoskeletons) tain peptide hormones, circulating proteins, and is a linear polymer of N-acetylglucosamine joined blood cells. Selectins are plasma membrane by (b1n4) linkages. Bacterial cell walls contain lectins that bind carbohydrate chains in the ex- peptidoglycans, linear polysaccharides of alter- tracellular matrix or on the surfaces of other nating N-acetylmuramic acid and N-acetylglu- cells, thereby mediating the flow of information cosamine residues, cross-linked by short peptide between cell and matrix or between cells. The chains. The extracellular matrix surrounding adhesion of bacterial and viral pathogens (in- cells in animal tissues contains a variety of gly- cluding Helicobacter, the cholera and pertussis cosaminoglycans, linear polymers of alternating toxins, and the influenza virus) to their animal Chapter 9 Carbohydrates and Glycobiology 321

cell targets occurs through binding of lectins in ments for further analysis; methylation analysis to the pathogens to oligosaccharides of the target locate the glycosidic bonds; mass spectrometry, cell surface. stepwise degradation, and NMR spectroscopy to The structure of oligosaccharides and poly- determine sequences; and high-resolution NMR saccharides is investigated by a combination of spectroscopy to establish configurations at methods: specific enzymatic hydrolysis to deter- anomeric carbons. mine stereochemistry and produce smaller frag- further reading

General Background on Carbohydrate Iozzo, R.V. (1998) Matrix proteoglycans: from Chemistry molecular design to cellular function. Annu. Rev. Aspinall, G.O. (ed.) (1982, 1983, 1985) The Poly- Biochem. 67, 609–652. saccharides, Vols 1–3, Academic Press, Inc., New A review focusing on the most recent genetic and York. molecular biological studies of the matrix proteogly- cans. The structure-function relationships of some Collins, P.M. & Ferrier, R.J. (1995) Monosaccha- paradigmatic proteoglycans are discussed in depth rides: Their Chemistry and Their Roles in Natural and novel aspects of their biology are examined. Products, John Wiley & Sons, Chichester, England. Jackson, R.L., Busch, S.J., & Cardin, A.D. (1991) A comprehensive text at the graduate level. Glycosaminoglycans: molecular properties, protein Fukuda, M. & Hindsgaul, O. (1994) Molecular interactions, and role in physiological processes. Glycobiology, IRL Press at Oxford University Press, Physiol. Rev. 71, 481–539. Inc., New York. An advanced review of the chemistry and biology of Thorough, advanced treatment of the chemistry and glycosaminoglycans. biology of cell surface carbohydrates. Good chapters on lectins, carbohydrate recognition in cell-cell inter- Glycoproteins actions, and chemical synthesis of oligosaccharides. Gahmberg, C.G. & Tolvanen, M. (1996) Why Lehmann, J. (Haines, A.H., trans.) (1998) Carbo- mammalian cell surface proteins are glycoproteins. hydrates: Structure and Biology, G. Thieme, Verlag, Trends Biochem. Sci. 21, 308–311. New York. Kobata, A. (1992) Structures and functions of the The fundamentals of carbohydrate chemistry and sugar chains of glycoproteins. Eur. J. Biochem. 209, biology, presented at a level suitable for advanced 483–501. undergraduates and graduate students. Opdenakker, G., Rudd, P., Ponting, C., & Dwek, R. Melendez-Hevia, E., Waddell, T.G., & Shelton, (1993) Concepts and principles of glycobiology. E.D. (1993) Optimization of molecular design in the FASEB J. 7, 1330–1337. evolution of metabolism: the glycogen molecule. A review that considers the genesis of glycoforms, Biochem. J. 295, 477–483. functional roles for glycosylation, and structure- Morrison, R.T. & Boyd, R.N. (1992) Organic function relationships for several glycoproteins. Chemistry, 6th edn, Prentice Hall, Inc., Englewood Varki, A. (1993) Biological roles of oligosaccharides: Cliffs, NJ. all of the theories are correct. Glycobiology 3, 97–130. Chapters 34 and 35 cover the structure, stereochem- istry, nomenclature, and chemical reactions of carbo- Lectins, Recognition, and Adhesion hydrates. Aplin, A.E., Howe, A., Alahari, S.K., & Juliano, Pigman, W. & Horton, D. (eds) (1970, 1972, 1980) R.L. (1998) Signal transduction and signal modula- The Carbohydrates: Chemistry and Biochemistry, tion by cell adhesion receptors: the role of integrins, Vols IA, IB, IIA, and IIB, Academic Press, Inc., New cadherins, immunoglobulin-cell adhesion molecules, York. and selectins. Pharmacol. Rev. 50, 197–263. Comprehensive treatise on carbohydrate chemistry. Borén, T., Normark, S., & Falk, P. (1994) Heli- cobacter pylori: molecular basis for host recognition Glycosaminoglycans and Proteoglycans and bacterial adherence. Trends Microbiol. 2, Carney, S.L. & Muir, H. (1988) The structure and 221–228. function of cartilage proteoglycans. Physiol. Rev. 68, A look at the role of the oligosaccharides that deter- 858–910. mine blood type in the adhesion of this microorgan- An advanced review. ism to the stomach lining, producing ulcers. 322 Part II Structure and Catalysis

Cornejo, C.J., Winn, R.K., & Harlan, J.M. (1997) Analysis of Carbohydrates Anti-adhesion therapy. Adv. Pharmacol. 39, 99–142. Chaplin, M.F. & Kennedy, J.F. (eds) (1994) Car- Analogs of recognition oligosaccharides are used to block adhesion of a pathogen to its host cell target. bohydrate Analysis: A Practical Approach, 2nd edn, IRL Press, Oxford. Hooper, L.A., Manzella, S.M., & Baenziger, J.U. Very useful manual for analysis of all types of sugar- (1996) From legumes to leukocytes: biological roles containing molecules—monosaccharides, polysaccha- for sulfated carbohydrates. FASEB J. 10, 1137–1146. rides and glycosaminoglycans, glycoproteins, proteo- Evidence for roles of sulfated oligosaccharides in glycans, and glycolipids. peptide hormone half-life, symbiont interactions in Dwek, R.A., Edge, C.J., Harvey, D.J., & nitrogen-fixing legumes, and lymphocyte homing. Wormald, M.R. (1993) Analysis of glycoprotein- Horwitz, A.F. (1997) Integrins and health. Sci. Am. associated oligosaccharides. Annu. Rev. Biochem. 276 (May), 68–75. 62, 65–100. Article on the role of integrins in cell-cell adhesion, Excellent survey of the uses of NMR, mass spectrom- and possible roles in arthritis, heart disease, stroke, etry, and enzymatic reagents to determine oligosac- osteoporosis, and the spread of cancer. charide structure. Leahy, D.J. (1997) Implications of atomic-resolution Fukuda, M. & Kobata, A. (1993) Glycobiology: structures for cell adhesion. Annu. Rev. Cell Dev. A Practical Approach, IRL Press, Oxford. Biol. 13, 363–393. A how-to manual for the isolation and characteriza- This review briefly describes several recently deter- tion of the oligosaccharide moieties of glycoproteins, mined structures of cell adhesion molecules, summa- using the whole range of modern techniques. rizes some of the main findings about each structure, and highlights common features of different cell ad- Jay, A. (1996) The methylation reaction in carbohy- hesion systems. drate analysis. J. Carbohydr. Chem. 15, 897–923. Thorough description of methylation analysis of car- McEver, R.P., Moore, K.L., & Cummings, R.D. bohydrates. (1995) Leukocyte trafficking mediated by selectin- carbohydrate interactions J. Biol. Chem. 270, Lennarz, W.J. & Hart, G.W. (eds) (1994) Guide 11,025–11,028. to Techniques in Glycobiology, Methods in Enzy- A short review that focuses on the interaction of mology, Vol. 230, Academic Press, Inc., New York. selectins with their carbohydrate ligands. Practical guide to working with oligosaccharides. Sharon, N. & Lis, H. (1993) Carbohydrates in cell McCleary, B.V. & Matheson, N.K. (1986) Enzymic recognition. Sci. Am. 268 (January), 82–89. analysis of polysaccharide structure. Adv. Carbo- Chemical basis and biological roles of carbohydrate hydr. Chem. Biochem. 44, 147–276. recognition. On the use of purified enzymes in analysis of struc- ture and stereochemistry. Varki, A. (1997) Sialic acids as ligands in recognition phenomena. FASEB J. 11, 248–255. Rudd, P.M., Guile, G.R., Kuester, B., Harvey, D.J., Opdenakker, G., & Dwek, R.A. (1997) Weiss, W.I. & Drickamer, K. (1996) Structural ba- Oligosaccharide sequencing technology. Nature 388, sis of lectin–carbohydrate recognition. Annu. Rev. 205–207. Biochem. 65, 441–473. Good treatment of the chemical basis of carbohydrate- protein interactions.

problems

1. Determination of an Empirical Formula An gen is reduced to a hydroxyl group. For example, unknown substance containing only C, H, and O was D-glyceraldehyde can be reduced to glycerol. How- isolated from goose liver. A 0.423 g sample produced ever, this sugar alcohol is no longer designated D or L.

0.620 g of CO2 and 0.254 g of H2O after complete com- Why? bustion in excess oxygen. Is the empirical formula of this substance consistent with its being a carbohy- 3. Melting Points of Monosaccharide Osazone drate? Explain. Derivatives Many carbohydrates react with phenyl-

2. Sugar Alcohols In the monosaccharide deriva- hydrazine (C6H5NHNH2) to form bright yellow crys- tives known as sugar alcohols, the carbonyl oxy- talline derivatives known as osazones: Chapter 9 Carbohydrates and Glycobiology 323

H O b-D-galactose shows an optical rotation of only 52.8. Moreover, the rotation increases over time to an equilib- C H C NNHC6H5 rium value of 80.2, identical to that reached by

H C OH C6H5NHNH2 C NNHC6H5 a-D-galactose. OH C H OH C H (a) Draw the Haworth perspective formulas of the a and b forms of D-galactose. What feature distin- H C OH H C OH guishes the two forms? H C OH H C OH (b) Why does the optical rotation of a freshly pre- pared solution of the a form gradually decrease with CH OH CH OH 2 2 time? Why do solutions of the a and b forms (at equal Glucose Osazone derivative of glucose concentrations) reach the same optical rotation at equilibrium? The melting temperatures of these derivatives are eas- (c) Calculate the percentage composition of the ily determined and are characteristic for each osazone. two forms of D-galactose at equilibrium. This information was used to help identify monosac- charides before the development of HPLC or gas- 5. A Taste of Honey The fructose in honey is liquid chromatography. Listed below are the melting mainly in the b-D-pyranose form. This is one of the points (MPs) of some aldose-osazone derivatives: sweetest substances known, about twice as sweet as glucose. The b-D-furanose form of fructose is much less sweet. The sweetness of honey gradually de- MP of MP of creases at a high temperature. Also, high-fructose anhydrous osazone Monosaccharide monosaccharide (C) derivative (C) corn syrup (a commercial product in which much of the glucose in corn syrup is converted to fructose) is Glucose 146 205 used for sweetening cold but not hot drinks. What Mannose 132 205 chemical property of fructose could account for both Galactose 165–168 201 of these observations? Talose 128–130 201 6. Glucose Oxidase in Determination of Blood Glucose The enzyme glucose oxidase isolated from As the table shows, certain pairs of derivatives have the mold Penicillium notatum catalyzes the oxida- the same melting points, although the underivatized tion of b-D-glucose to D-glucono-d-lactone. This en- monosaccharides do not. Why do glucose and man- zyme is highly specific for the b anomer of glucose and nose, and galactose and talose, form osazone deriva- does not affect the a anomer. In spite of this speci- tives with the same melting points? ficity, the reaction catalyzed by glucose oxidase is commonly used in a clinical assay for total blood glu- 4. Interconversion of D-Galactose Forms A so- cose—that is, for solutions consisting of a mixture of lution of one stereoisomer of a given monosaccharide b- and a-D-glucose. How is this possible? Aside from rotates plane-polarized light to the left (counterclock- allowing the detection of smaller quantities of glucose, wise) and is called the levorotatory isomer, designated what advantage does glucose oxidase offer over (); the other stereoisomer rotates plane-polarized Fehling’s reagent for the determination of blood glu- light to the same extent but to the right (clockwise) cose? and is called the dextrorotatory isomer, designated (). An equimolar mixture of the () and () forms 7. Invertase “Inverts” Sucrose The hydrolysis of does not rotate plane-polarized light. sucrose (specific rotation 66.5) yields an equimolar The optical activity of a stereoisomer is expressed mixture of D-glucose (specific rotation 52.5) and quantitatively by its optical rotation, the number of D-fructose (specific rotation –92). degrees by which plane-polarized light is rotated on (a) Suggest a convenient way to determine the passage through a given path length of a solution of rate of hydrolysis of sucrose by an enzyme preparation the compound at a given concentration. The specific extracted from the lining of the small intestine. rotation [a]25C of an optically active compound is de- D (b) Explain why an equimolar mixture of D-glucose fined thus: and D-fructose formed by hydrolysis of sucrose is

observed optical rotation () called invert sugar in the food industry. [a] 25C D optical path length (dm) concentration (gmL) (c) The enzyme invertase (its preferred name is now sucrase) is allowed to act on a solution of sucrose The temperature and the wavelength of the light em- until the optical rotation of the solution becomes zero. ployed (usually the D line of sodium, 589 nm) must be What fraction of the sucrose has been hydrolyzed? specified in the definition. A freshly prepared solution of a-D-galactose (1 g/mL 8. Manufacture of Liquid-Filled Chocolates The in a 10 cm cell) shows an optical rotation of 150.7. manufacture of chocolates containing a liquid center is Over time, the observed rotation of the solution gradually an interesting application of enzyme engineering. The decreases and reaches an equilibrium value of 80.2. flavored liquid center consists largely of an aqueous so- In contrast, a freshly prepared solution (1 g/mL) of lution of sugars rich in fructose to provide sweetness. 324 Part II Structure and Catalysis

The technical dilemma is the following: the chocolate 14. Heparin Interactions Heparin, a highly nega- coating must be prepared by pouring hot melted choco- tively charged glycosaminoglycan, is used clinically as late over a solid (or almost solid) core, yet the final an anticoagulant. It acts by binding several plasma product must have a liquid, fructose-rich center. Sug- proteins, including antithrombin III, an inhibitor of gest a way to solve this problem. (Hint: Sucrose is much blood clotting. The 1:1 binding of heparin to an- less soluble than a mixture of glucose and fructose.) tithrombin III appears to cause a conformational 9. Anomers of Sucrose? Although lactose exists in change in the protein that greatly increases its ability two anomeric forms, no anomeric forms of sucrose to inhibit clotting. What amino acid residues of an- have been reported. Why? tithrombin III are likely to interact with heparin? 15. Information Content of Oligosaccharides 10. Physical Properties of Cellulose and Glycogen The carbohydrate portion of some glycoproteins may The almost pure cellulose obtained from the seed serve as a cellular recognition site. In order to perform threads of Gossypium (cotton) is tough, fibrous, and this function, the oligosaccharide moiety of glycopro- completely insoluble in water. In contrast, glycogen teins must have the potential to exist in a large variety obtained from muscle or liver disperses readily in hot of forms. Which can produce a greater variety of struc- water to make a turbid solution. Although they have tures: oligopeptides composed of five different amino markedly different physical properties, both sub- acid residues or oligosaccharides composed of five dif- stances are composed of (1n4)-linked D-glucose poly- ferent monosaccharide residues? Explain. mers of comparable molecular weight. What structural features of these two polysaccharides underlie their 16. Determination of the Extent of Branching in different physical properties? Explain the biological Amylopectin The extent of branching (number of advantages of their respective properties. (a1n6) glycosidic bonds) in amylopectin can be de- termined by the following procedure. A sample of 11. Growth Rate of Bamboo The stems of bamboo, amylopectin is exhaustively treated with a methylating a tropical grass, can grow at the phenomenal rate of agent (methyl iodide) that replaces all the hydrogens 0.3 m/day under optimal conditions. Given that the of the sugar hydroxyls with methyl groups, converting stems are composed almost entirely of cellulose fibers OH to OCH . All the glycosidic bonds in the oriented in the direction of growth, calculate the num- X X 3 treated sample are then hydrolyzed in aqueous acid. ber of sugar residues per second that must be added The amount of 2,3-di-O-methylglucose in the hy- enzymatically to growing cellulose chains to account drolyzed sample is determined. for the growth rate. Each D-glucose unit in the cellu- lose molecule is about 0.45 nm long. 12. Glycogen as Energy Storage: How Long Can CH OH a Game Bird Fly? Since ancient times it has been 2 O observed that certain game birds, such as grouse, HH quail, and pheasants, are easily fatigued. The Greek H historian Xenophon wrote, “The bustards . . . can be OCH3 H HO OH caught if one is quick in starting them up, for they will fly only a short distance, like partridges, and soon tire; H OHC 3 and their flesh is delicious.” The flight muscles of game 2,3-Di-O-methylglucose birds rely almost entirely on the use of glucose 1-phos- phate for energy, in the form of ATP (Chapter 15). In game birds, glucose 1-phosphate is formed by the breakdown of stored muscle glycogen, catalyzed by (a) Explain the basis of this procedure for deter- the enzyme glycogen phosphorylase. The rate of ATP mining the number of (a1n6) branch points in amy- production is limited by the rate at which glycogen can lopectin. What happens to the unbranched glucose be broken down. During a “panic flight,” the game residues in amylopectin during the methylation and bird’s rate of glycogen breakdown is quite high, ap- hydrolysis procedure? proximately 120 mmol/min of glucose 1-phosphate pro- (b) A 258 mg sample of amylopectin treated as de- duced per gram of fresh tissue. Given that the flight scribed above yielded 12.4 mg of 2,3-di-O-methylglu- muscles usually contain about 0.35% glycogen by cose. Determine what percentage of the glucose weight, calculate how long a game bird can fly. residues in amylopectin contain an (a1n6) branch. 13. Volume of Chondroitin Sulfate in Solution 17. Structural Analysis of a Polysaccharide A One critical function of chondroitin sulfate is to act as polysaccharide of unknown structure was isolated, a lubricant in skeletal joints by creating a gel-like subjected to exhaustive methylation, and hydrolyzed. medium that is resilient to friction and shock. This Analysis of the products revealed three methylated function appears to be related to a distinctive property sugars in the ratio 2011. The sugars were 2,3,4- of chondroitin sulfate: the volume occupied by the tri-O-methyl-D-glucose; 2,4-di-O-methyl-D-glucose; and molecule is much greater in solution than in the dehy- 2,3,4,6-tetra-O-methyl-D-glucose. What is the structure drated solid. Why is the volume occupied by the mol- of the polysaccharide? ecule so much larger in solution?