<<

Research 344 (2009) 2157–2166

Contents lists available at ScienceDirect

Carbohydrate Research

journal homepage: www.elsevier.com/locate/carres

Twisting of glycosidic bonds by hydrolases

Glenn P. Johnson a, , Luis Petersen b, , Alfred D. French a, Peter J. Reilly b,* a Southern Regional Research Center, Agricultural Research Service, U.S. Department of Agriculture, 1100 Robert E. Lee Boulevard, New Orleans, LA 70124-4305, USA b Department of Chemical and Biological Engineering, 2114 Sweeney Hall, Iowa State University, Ames, IA 50011-2230, USA article info abstract

Article history: Patterns of scissile bond twisting have been found in crystal structures of hydrolases (GHs) that Received 6 July 2009 are complexed with substrates and inhibitors. To estimate the increased potential energy in the sub- Received in revised form 7 August 2009 strates that results from this twisting, we have plotted torsion angles for the scissile bonds on hybrid Accepted 11 August 2009 Quantum Mechanics::Molecular Mechanics energy surfaces. Eight such maps were constructed, including Available online 14 August 2009 one for a- and three for different forms of methyl a-acarviosinide to provide energies for twisting of a-(1,4) glycosidic bonds. Maps were also made for b-thiocellobiose and for three b- conform- Keywords: ers having different glycon ring shapes to model distortions of b-(1,4) glycosidic bonds. Different GH fam- Conformational analysis ilies twist scissile glycosidic bonds differently, increasing their potential energies from 0.5 to 9.5 kcal/ Glycoside hydrolases Glycosidic bonds mol. In general, the direction of twisting of the glycosidic bond away from the conformation of lowest Intramolecular energy intramolecular energy correlates with the position (syn or anti) of the proton donor with respect to the Molecular mechanics glycon’s ring atom. This correlation suggests that glycosidic bond distortion is important for Quantum mechanics the optimal orientation of one of the glycosidic oxygen lone pairs toward the ’s proton donor. Ó 2009 Elsevier Ltd. All rights reserved.

1. Introduction energy barrier to cleavage, although strain was ruled out as the ma- jor contribution toward lowering the reaction barrier. Electrostatic It has been known for many years that the substrate carbohy- interactions are proposed to have a more important role.7 drate residue located immediately to the non-reducing side of There is now a sufficient number of GH crystal structures with the scissile glycosidic bond (in enzyme subsite 1) is distorted bound in active sites to address the question of the during the cleavage reaction catalyzed by many glycoside effect of GH–ligand binding on scissile bond geometry. Our ap- hydrolases (GHs).1 Evidence of ring conformations of higher intra- proach was to plot the available data for different a- and b-glyco- 4 molecular energy than for the optimal C1 shape has become sidic linkages involving , oxygen, and sulfur [NOS] on overwhelming in recent years as crystal structures of GH–ligand maps of EIntra values. Thus, we have plotted experimental values complexes have been solved.2,3 of / (O50–C10–[NOS]4–C4) and w (C10–[NOS]4–C4–C5) (Fig. 1)on A second distortion, that of the torsion angles for the two bonds energy surfaces that are based on hybrid Quantum Mechan- connected to the glycosidic oxygen atom, has been discussed much ics::Molecular Mechanics (QM::MM) computations in an attempt less. One article was devoted to twisting distortions for all linkages to gain new insights on this question. in oligo- and bound by proteins.4 Most linkages had low-energy conformations, but some scissile linkages corre- sponded to high energies. Distortion of substrate torsion angles 2. families and mechanisms by GHs was also mentioned by Fujimoto et al.5 Several different cleavage functions could be enhanced by twist- GHs have been classified into >100 families based on similar- 8 ing glycosidic bonds. Distortion of the scissile bond can improve ac- ities in their primary structures. Tertiary structures, that is, the cess of the catalytic amino acid residues to the glycosidic oxygen overall arrangement of helices, strands, and loops, are expected atom and the C10 atom of the non-reducing-side carbohydrate resi- to be quite similar within individual families. Crystal structures due. Changes in the electronic structure resulting from torsional are known for with a ligand bound in subsites 1 and 6 changes could enhance reactivity. Also, an increase of EIntra, the rel- +1 that belong to 14 families that cleave either a-1,4 or b-1,4 gly- ative intramolecular energy above that for the lowest-energy con- cosidic bonds. These subsites contain the glycosyl residues to the formation of the scissile bond torsion angles, could reduce the immediate non-reducing and reducing sides, the glycon and aglycon, respectively, of the scissile glycosidic bond (Table 1 and Supplementary data). Members of five families hydrolyze * Corresponding author. Tel.: +1 515 294 5968; fax: +1 515 294 2689. E-mail address: [email protected] (P.J. Reilly). a-glycosidic bonds, while members of the other nine cleave Contributed equally to the work. b-glycosidic bonds.

0008-6215/$ - see front matter Ó 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.carres.2009.08.011 2158 G. P. Johnson et al. / Carbohydrate Research 344 (2009) 2157–2166

Maltose Cellobiose

4C O5’ 4C 4C 1 1 1 C1’

O4 C4 4C 1 C5 4C 1 2S O Methyl -acarviosinide

2H 3 4C 1

4C Thiocellobiose 1

4C 1 1S 4C 3 1

Figure 1. Structures of molecules used to construct energy contour maps. Ring conformations are denoted with labels. Superscripts and subscripts denote the atoms above or below the plane of the ring, respectively.

Occasionally, different GH families appear to be distantly re- ken and a hydroxyl group from the coordinated molecule re- lated to each other by having similar tertiary structures and mech- places it, inverting the configuration of the carbohydrate residue. anisms, and therefore they are grouped into clans.8 Among the The remaining proton from the water molecule is accepted by families listed in Table 1, GH families 2 and 5 (GH2 and GH5) are the dissociated carboxyl group. part of Clan GH-A, while GH13 and GH77 belong to Clan GH-H. In the retaining mechanism, the protonated carboxyl group do- Other families listed in Table 1 are either the only representatives nates a proton to the glycosidic oxygen atom, and the dissociated of their clans with known tertiary structures containing glucosyl or carboxyl group forms a covalent bond with the C10 atom. The bond -derived carbohydrate ligands in subsites 1 and +1, or between the C10 atom and the glycosidic oxygen atom is broken they have not been classified into clans. and a hydroxyl group from the water molecule hydrates the C10 GHs hydrolyze glycosidic bonds by two general mechanisms,9 atom, breaking the covalent bond with the carboxyl group. This with some exceptions. In the inverting mechanism, a carboxyl double-displacement reaction retains the anomeric configuration group on an amino acid side-chain donates a proton to the scissile of the glycon. Of the families listed in Table 1, five have inverting glycosidic oxygen atom. Meanwhile, the oxygen atom of a water mechanisms and nine have retaining mechanisms. molecule coordinated by a dissociated carboxyl group on a second The protonated carboxyl group is usually positioned laterally on amino acid side-chain forms a partial bond with the C10 atom of the either side of the O4 atom (the glycosidic oxygen atom). The differ- carbohydrate ring in subsite 1. The complex proceeds through ent positions of the proton donor are labeled as syn or anti.10 the transition state with the bond between O50 and C10 atoms Syn-proton donors are on the same side as the O50 atom (the glycon assuming partial double bond character. This causes the C50–O50– ring oxygen atom), while anti proton donors are on the opposite C10–C20 atoms to form a plane. Finally, the glycosidic bond is bro- side (Fig. 2).

Table 1 GH families containing enzymes with crystal structures of bound in subsites 1 and +1

GH GH GH family members No. of Catalytic Ligand glycon Ligand bond Position of family clan structures mechanism conformation configuration proton donor

4 2Ab-Galactosidase 1 R C1 b-(1,4) anti 4 3—b- glucohydrolase 2 R C1 b-(1,4) anti 4 5 A Endocellulase, EG 4 R C1 b-(1,4) anti 4 2 6 — CBH, EG 3 I C1, SO b-(1,4) syn 7 B CBH, EG 1 R Distorted 1,4B b-(1,4) syn 8MEG 1I 2,5B b-(1,4) anti 9 — CBH 1 I 4E b-(1,4) syn 1 12 C EG 2R S3 b-(1,4) syn 2 4 13 H CGTase, a-, neopullulanase, glucan 1,4-a-maltohexaosidase, 42 R H3, C1 a-(1,4) syn amylosucrase, 4-a-glucanotransferase, maltogenic amylase 4 14 — b-Amylase 4 I C1 a-(1,4) syn 2 2 15 L Glucoamylase, glucodextranase 5 I H3, E a-(1,4) syn 1 44 — EG 1R S3 b-(1,4) anti 2 57 — 4-a-Glucanotransferase 1 R H3 a-(1,4) anti 4 77 H 4-a-Glucanotransferase 1 R C1 a-(1,4) anti

Abbreviations: CBH: cellobiohydrolase; CGTase: cyclomaltodextrin glucanotransferase; EG: endo-1,4-b-glucanase; GH: glycoside hydrolase; I: inverting; R: retaining. G. P. Johnson et al. / Carbohydrate Research 344 (2009) 2157–2166 2159

Figure 2. The different locations of the proton donor in glycoside hydrolases. The proton donor is labeled as syn if it is on the same side of the glycon ring oxygen atom. If it is on the opposite side, it is labeled as anti. Dihedral twisting is different for syn (right) than for anti (left).

3. Methods investigated further. The residues in subsite +1 were not so re- stricted, and all mutated enzyme forms were considered equally. 3.1. Gathering data on individual structures As mentioned earlier, many GHs, especially those acting on b- glycosidic bonds, preactivate the substrate for catalysis by distort- Crystal structures of GHs with ligands spanning subsites 1 and ing the ring bearing the scissile bond away from the most 8 4 2,3,12,13 +1 (the cleavage site) were gathered through the CAZy database. stable C1 form. Crystal structures of GH–ligand complexes They were visualized with PyMOL,11 and their / and w values were show several different glycon conformations (Table 1). Many non- determined. Publications associated with the crystal structures optimal ligand ring conformations are maintained by being part of were consulted to confirm the ligand placement in the active site putative transition-state analogs such as acarbose, a potent inhib- and glycon conformation. When more than one catalytic domain itor of and other enzymes that hydrolyze the a-(1,4) gly- occurred in the crystal structure, the torsion angles of their bound cosidic bonds of and its partial hydrolyzates. The non- 2 ligands were averaged. reducing-end ring of acarbose is a H3 conformer maintained by The 68 GH structures (Supplementary data) with ligands having a double bond between its C50 and C70 atoms, the latter replacing a-orb-(1,4) glycosidic bonds over the enzyme cleavage site and the O50 atom normally found in pyranosyl rings. In addition, a with glucosyl or acarbose-derived residues in subsite 1 were protonated nitrogen atom replaces the glycosidic oxygen atom

60 30 3

5

6 0 25

30

26 20 60 28 4 20 12 8 24 2 3 1 16 18 120 15 12 16 (degrees) 18 14 12

16 180 18 10 2

3 10 20 8

18 240 18 5 3 12

2

300 0 60 0 60 120 180 240 300 (degrees)

4 Figure 3. EIntra contour map of a-maltose with its two a-glucosyl residues in C1 conformations, and with crystal-structure ligand torsion angles plotted on it. Green, GH13; yellow, GH14; purple, GH77. 2160 G. P. Johnson et al. / Carbohydrate Research 344 (2009) 2157–2166

4 and a methyl group replaces the methanol group at the C5 atom of bonds and glycon C1 conformers linked by a sulfur atom; (5) a 1 the residue to the reducing side of the scissile bond. Often, GHs map of b- S3-cellobiose (Fig. 7) for GH9, GH12, and GH44 enzymes 1 incubated with acarbose retain sufficient activity to rearrange it that bind ligands with b-(1,4) scissile bonds and glycon S3 con- 2 into various derivatives consisting of acarviosinose (the two non- formers; and (6) a map of b- SO-cellobiose (Fig. 8) for a GH8 mem- 2 glucosyl residues of acarbose) (Fig. 1) and different numbers of glu- ber that binds ligands with b-(1,4) scissile bonds and glycon SO cosyl residues to either side of this fragment. conformers. Most 1,4-glucosyl-acting GHs with known tertiary structures bind six types of ligands (Fig. 1), leading to construction of eight 3.2. Energy determination hybrid QM::MM maps of EIntra at various //w values of the bound ligands: (1) a map for a-maltose (Fig. 3, with energies taken from A detailed description of the non-integral hybrid QM::MM Johnson, et al.14) for GH13, GH14, and GH77 enzymes that bind li- method is provided elsewhere.15 This method, when used with 4 gands with a-(1,4) scissile bonds and glycon C1 conformers; (2) an elevated dielectric constant, has successfully predicted the con- three maps of the R-, S-, and positively-charged forms of methyl formations of numerous linkages in small-molecule a-acarviosinide (Fig. 4) for GH13, GH15, and GH57 members that crystal structures, typically finding a majority within the 1 kcal/ 2 14,16,17 bind ligands with a-(1,4) scissile bonds and glycon H3 conform- mol contour, and with no distortions greater than 4 kcal/ 4 15 ers; (3) a map of b- C1-cellobiose (Fig. 5) for GH2, GH5, GH6, and mol. Furthermore, the distribution of energies follows a Boltz- GH7 enzymes that bind ligands with b-(1,4) scissile bonds and gly- mann-like exponential decay curve, indicating that the method is 4 18 con C1 conformers; (4) a map of b-thiocellobiose (Fig. 6) for GH3, predictive for condensed-phase systems. The elevated dielectric GH5, and GH6 members that bind ligands with b-(1,4) scissile constant diminishes by 80% the otherwise dominating influence

60 60 4 A 3 B 6 10 6 8 0 16 0 20 8 30 8 28 8 18 26 5 10 4 18 5 60 5 60 3 18 12 14 2 1 8

3 3 10 120 1 120 5 10 8 5

3 180 20 180 4 14 24 14 12 28 24 8 5 10 20 240 240 16 8 18 12 8 6 4 16 3 3 300 300 60 0 60 120 180 240 300 60 0 60 120 180 240 300

60 30 (degrees) 3 C 10 6 18 14 0 25 20 12

10

60 20 6 5 28

2 20 3 12 120 1 3 15 8 8 8 12 180 12 18 10 22 16 26 18 10 240 8 5

5 6 4 300 0 60 0 60 120 180 240 300 (degrees)

2 4 Figure 4. EIntra contour map of methyl a-acarviosinide with its glycon in the H3 conformation and its aglycon in the C1 conformation, and with crystal-structure ligand torsion angles plotted on it. (A) Methyl a-R-acarviosinide; (B) methyl a-S-acarviosinide; (C) methyl a-acarviosinide-H+. Green, GH13; cyan, GH15; red, GH57. G. P. Johnson et al. / Carbohydrate Research 344 (2009) 2157–2166 2161

60 4 25 6 8 0

8 12 20 16

60 6 24 5 20 2 15 14 6

1 5 8 4 120 6 4

8 (degrees) 2 10

3 180 6 8 4

12

8 5 20 14 10 5 240 4 5 3

300 0 300 240 180 120 60 0 60 (degrees)

4 Figure 5. EIntra contour map of b-cellobiose with its two b-glucosyl residues in C1 conformations, and with crystal-structure ligand torsion angles plotted on it. Cyan, GH2; pink, GH5; red, GH6; yellow, GH7.

60 1 25 4 6 0

8 20 26 10 18 16 12 60 5 6

2 15

4

8

4 4 2 1 120 6 8 6 2 3 4 (degrees) 5 5 10 8 180 12

20 16 10 5 240 4 3

2 1

5 8 5

300 0 300 240 180 120 60 0 60 (degrees)

4 Figure 6. EIntra contour map of b-thiocellobiose with its two b-glucosyl residues in C1 conformations, and with crystal-structure ligand torsion angles plotted on it. Green, GH3; pink, GH5; red, GH6. of intramolecular hydrogen bonds without the need to explicitly of a tetrahydropyran (THP) dimer. For the methyl a-acarviosi- 1 2 consider neighboring molecules. These hybrid maps are con- nide, b-thiocellobiose, b- S3-cellobiose, and b- SO-cellobiose structed from three different //w surfaces with energies calculated maps, the hydroxyl groups of the full disaccharide were at 20° intervals of / and w: replaced by hydrogen atoms, giving a dimer of methyl THP (methyl cyclohexene and methyl THP for the methyl a-acarvi- 1. A QM relative energy map to describe the core part of the mol- osinide analog). ecule that includes the sugar rings and glycosidic linkage (the 2. An MM relative energy map of the analog molecule as described 4 analog). For a-maltose and b- C1-cellobiose, the analog consists above. 2162 G. P. Johnson et al. / Carbohydrate Research 344 (2009) 2157–2166

60 50 4

6 45

0 26 18 40 40 8 50

35 60 45 5 6 40 3 30 14

30 2 10 120 25 22

1

10 (degrees) 18 20

18

30 180 8 4 2 3

04 10 15 5 20 14 30 10 240 24

5 6 4 300 0 300 240 180 120 60 0 60 (degrees)

1 4 Figure 7. EIntra contour map of b-cellobiose with its glycon and aglycon b-glucosyl residues in S3 and C1 conformations, respectively, and with crystal-structure ligand torsion angles plotted on it. Cyan, GH9; red, GH12; green, GH44.

60 4 6 25 8 0 10 16 22 28 8 20 6 60 26 22 18 2 14 10 15 120 8 8

(degrees) 4 5 3 6 1 10 180 2 8

20

14

5

8 10 20 14 240 5 5 4

300 0 300 240 180 120 60 0 60 (degrees)

2 4 Figure 8. EIntra contour map of b-cellobiose with its glycon and aglycon b-glucosyl residues in SO and C1 conformations, respectively, and with crystal-structure ligand torsion angles plotted on it. Green: GH8.

3. An MM relative energy map of the full disaccharide to account map. This effectively replaces the MM energy contribution from for the interaction energy of the hydroxyl groups. the core of the molecule with that of QM calculations. The non-integral hybrid method was advantageous in the present The hybrid map is constructed from the above-mentioned work because parameters for the MM force field were not fully relative energy maps by subtracting the MM analog map from developed for the thiocellobiose and methyl a-acarviosinide the MM full disaccharide map and adding the QM analog molecules. G. P. Johnson et al. / Carbohydrate Research 344 (2009) 2157–2166 2163

3.2.1. QM analog maps sume a range of values, depending on the values of / and w,itis

The QM analog maps were constructed from QM calculations necessary to scan a range of the related wh torsion angle. To do this, using GAMESS19 or Jaguar version 7.020 software. The QM calcula- a set of values ranging from 120° to 240° in 5° increments was sub- 4 tions, except for those for the b- C1-cellobiose analog, were done at tracted from the wh torsion angle after w was set. The terminal * 4 the HF/6-31G level of theory. b- C1-cellobiose analog calculations hydrogen atom attached to the nitrogen atom of the wh torsion an- used B3LYP/6-311++G** theory, based on structures optimized at gle was set in a way that only the terminal hydrogen atom was the B3LYP/6-31G* level. Constraints were used to maintain / and moved. This effectively puts the improper torsion angle in the w at each increment. range of 120° to +120°, including ±180°, thus sampling the entire Additional steps were needed for some of the analog computa- chiral range. However, since a chiral assignment cannot be made 1 2 tions. For the b- S3- and b- SO-cellobiose analog calculations, three on the ±180° configurations, these were not included on the maps. extra torsion constraints were placed on the non-reducing ring to This range was not as broad as that of the QM harmonic constraint. 1 maintain the desired conformation. Also, some of the b- S3- and However, the QM data showed that the effective limits of the im- 2 b- SO-cellobiose analogs in high-energy regions of //w space re- proper torsion were 120° to 175° and 120° to 175° for R and S quired a constraint on the glycosidic angle to achieve convergence chirality, respectively. Each of the resulting wh conformations of the minimization. The values of those constrained angles were was used to calculate the data points at each required //w value. derived from the comparable MM calculation. Finally, some of The lowest energy conformation at each chiral configuration for 1 2 the b- S3- and b- SO-cellobiose analog calculations required con- each //w point was then used for the respective MM3 analog maps. 4 straints on the reducing ring to maintain its C1 conformation. The other QM analog calculations that needed special handling 3.2.3. MM disaccharide maps were those for the methyl a-acarviosinide analog, specifically the R The MM disaccharide maps were constructed from calculations and S chiral configurations of the neutral molecules. The single with MM3(96), except that the hydrogen bonding parameters were hydrogen atom and the corresponding lone pair attached to the from the MM3(92) version of the force field and the dielectric con- linkage nitrogen of the neutral molecules make the linkage nitro- stant was set to 7.5. For the chiral methyl a-acarviosinide maps, gen a chiral atom. However, a constraint was needed on the hydro- the wh constraint used at each //w point of the disaccharide map gen atom to ensure sampling of both chiral configurations. This was that of the respective MM analog map at the equivalent //w was accomplished by using the harmonic constraint capability of point. A set of starting conformations was generated, based on Jaguar and an improper dihedral angle as a proxy for the chirality. variations of the exo-cyclic group orientations, and each resulting A harmonic constraint was placed on the C4–C10–N4–H improper conformation was considered at each //w point of the map. The dihedral angle of the molecules with R and S chiral configurations. lowest energy conformation at each //w point was used for the The value of the constraint was 135° for the R configuration and respective maps. +135° for the S configuration with a well half-width of 40°. Thus, Energies for the a-maltose map4 were based on a set of 58 confor- the value of the improper dihedral was constrained to the value mations as described in previous work.22 A total of 2187 conforma- of the target ±40°. The force constant of the constraint was tions for the methyl a-acarviosinide molecules were generated by 10 kcal mol1, which sufficed for most of the calculations at the respective //w points. In some cases, the constraint was not strong 1. Setting the O5–C1–O1–CH3 torsion angle to 60°. enough and the improper dihedral would reverse the sign (and chi- 2. Systematically rotating the remaining 7 rotatable exo-cyclic rality) by crossing the ±180° value. In these cases, a progressively groups to values of 60°, 180°, and 60°. higher constraint force constant was used until the constraint was held in the appropriate well or, in rare cases, the computation Cellobiose and thiocellobiose molecules possess more exo-cyclic was aborted. groups than methyl a-acarviosinide, so a conformational search procedure was employed for them. An outline of the procedure 3.2.2. MM analog maps follows: The MM analog maps were constructed with the MM3(96) pro- gram.21 The dielectric constant was set to 1.5, the value used when 1. A structure was either obtained from crystal coordinates or parameterizing MM3 and recommended for isolated molecules. sketched with Maestro20 and minimized with MacroModel ver- Variation of the dielectric constant has little effect on the analog sion 9.520 using the OPLS-200523 force field. maps but was chosen none-the-less for maximum compatibility 2. A Monte Carlo multiple minimum conformational search proto- with the QM calculations. As done during the QM calculations, col was set up as follows: the / and w values were held to their appropriate values via tor- a. All torsion angles, except those of the pyranosyl rings, were sion constraints in the MM3 dihedral driver routines. Also like set as search variables. the QM analog calculations, some of the molecular systems re- b. The rings were constrained with three alternating torsion 1 2 quired special handling. For the b- S3- and b- SO-cellobiose analog constraints to hold the appropriate ring shape during mini- 1 2 calculations, three extra torsion constraints were placed on the mization. The constraints for the S3 and SO rings had force 4 non-reducing ring to maintain it in the desired conformation. constants of 1000 kJ/mol, while the C1 rings had a softer However, unlike the QM calculations, no additional constraints force constant of 100 kJ/mol. were needed on either the glycosidic angle or the reducing ring. c. 30,000 Monte Carlo steps were performed. Since MM3 does not support constraining improper dihedral an- d. The number of variables altered in each step was 2–4. gles and does not have harmonic constraint capability, a different e. The energy window (energy above global minimum at the technique was needed to constrain the hydrogen atom on the link- time) for acceptance of a conformer was 50 kJ/mol age nitrogen of methyl a-acarviosinide to insure complete sam- (12 kcal/mol). pling of the chiral configurations. f. The least used structures were used as starting geometries In lieu of using an improper torsion constraint, a constraint was for subsequent Monte Carlo steps. placed on the wh torsion angle (C5–C4–N4–H). Since this hydrogen g. The similarity criteria included distance checks of atoms as atom is also the terminal atom of the improper torsion angle (C4– well as torsion angle differences involving polar hydrogen C10–N4–H), there is a direct relationship between these proper and atoms. The distance threshold was 0.25 Å and the torsion improper torsion angles. Since the improper torsion angle can as- threshold was set to 60°. 2164 G. P. Johnson et al. / Carbohydrate Research 344 (2009) 2157–2166

h. The structures generated at each Monte Carlo step that methyl a-acarviosinide-H+) are qualitatively very similar to each passed initial checks were minimized with OPLS-2005 for other, the main difference among them being the position of their up to 5000 steps. global minima (Fig. 4). The methyl a-S-acarviosinide and methyl 3. Step 2 was performed ten times, for a total of 300,000 Monte a-acarviosinide-H+ maps have global minima near / =70°, Carlo steps. This seemed to be sufficient to approach conver- w = 140°, while the methyl a-R-acarviosinide map has a global gence of the search. minimum near / = 130°, w = 90°. The global minimum regions 4. Each member of the resulting conformation set was trans- are smaller in all three methyl a-acarviosinide maps than in the formed as follows: a-maltose map, but a second low-energy region is present in all a. The glycosidic angle was set to 150°.18 of the former. All four maps have low-energy valleys running along b. The / and w torsion angles were set to those of the global 50° < / < 140°, with less deep ones in the range of 180° < minimum of the set. w < 100°. The methyl R- and S-a-acarviosinide maps are gener- 5. A redundant conformer elimination procedure was performed. ally similar to the previously published MM3 map that combined This used heavy atoms and polar hydrogen atoms for testing the energies for both forms, but the maps for the protonated form the maximal distance threshold of 0.25 Å between conforma- have some important differences.25 Those MM3 maps were based tions. This eliminates redundant conformations caused by nor- on a much smaller number of starting geometries as well as a low- malizing the linkage variables as described in step 4 above. er dielectric constant. The pure MM3 map for the protonated form has a global minimum at / =67° and w =42° (after converting

The numbers of conformations generated for b-cellobiose and b- from /H and wH), whereas the global minimum on the hybrid thiocellobiose molecules with the above procedure were: map is near / =70°, w = 140°. Although the global minimum on the hybrid map corresponds to a secondary minimum on the pure 4 b- C1-cellobiose: 1863. MM3 map, and vice-versa, a very low-energy minimum on the hy- 1 b- S3-cellobiose: 2871. brid map has no close counterpart on the published pure MM3 2 b- SO-cellobiose: 3485. map. The observed crystal structure conformations are clustered b-thiocellobiose: 2277. around the global minimum on the hybrid map.

These numbers of starting structures are far greater than in any 4.1.2. b-1,4 Glycosidic bonds 4 of our previous calculations. The b- C1-cellobiose map has two low-energy valleys (Fig. 5). The first one runs along 130° < / < 50°, while the second one 4. Results and discussion runs along 160° < w < 80°. The global minimum is near / = 60°, w = 120°. Energy calculations on this new map differ from 4,16,17 4.1. Energy contour maps those on previous QM::MM3 cellobiose maps in that the number of starting geometries was greatly expanded and a dielec- 16,17 Eight //w energy contour maps were computed using the hy- tric constant of 7.5 was employed instead of 3.5 as previously. . brid QM::MM technique to measure the strain on the glycosidic Still, the features of these maps are quite similar. 4 bond upon twisting. Torsion angles obtained from Protein Data The b-thiocellobiose map is similar to the b- C1-cellobiose map, Bank crystal structures of GHs with bound ligands spanning the having the same valleys running along / and w (Fig. 6). However, cleavage site were plotted on the corresponding //w energy con- the b-thiocellobiose map has two widely separated minima of al- tour maps to gain insight into the glycosidic bond distortion that most the same EIntra value. The first one lies near / < 60°, 4 occurs upon binding by the enzyme’s active site. w < 120°, as in the b- C1-cellobiose map, while the second one For a-1,4 glycosidic bonds we constructed an a-maltose and is near / < 70°, w < 290°. three methyl a-acarviosinide energy contour maps (Figs. 3 and Distortion of the glycon ring significantly reduces the allowed 4). Maltose oligomers are the natural substrates of members of (low-energy) //w regions. The difference is most noticeable in 1 several GH families, while methyl a-acarviosinide is a powerful the b- S3-cellobiose energy contour map, where the low-energy re- inhibitor of maltose- and starch-degrading enzymes.24 The glyco- gion is contained only within 190° < / < 70°, 200° < w < 60° 1 sidic nitrogen atom of methyl a-acarviosinide, when singly pro- (Fig. 7). The global minimum in the b- S3-cellobiose map is shifted 4 tonated, may have either the R or the S chiral configuration. It 40° down in both / and w directions compared to the b- C1-cello- can also be doubly protonated, making the molecule positively biose map, and the energy in other regions is much higher than 4 charged. X-ray crystal structure studies of proteins are not usually that in the b- C1-cellobiose map, with energy values >40 kcal/mol able to provide hydrogen atom positions, so the protonation state above the global minimum at the mountaintops. The main reason 1 of methyl a-acarviosinide could not be determined. Therefore, we for the high energy values in the b- S3-cellobiose map is the steric plotted methyl a-acarviosinide torsion angle measurements on all clash between the bulky C6-OH hydroxyl groups of both sugar 1 three maps (methyl a-R-acarviosinide, methyl a-S-acarviosinide, rings, which occurs more often with the S3 disaccharide because and methyl a-acarviosinide-H+). the aglycon ring is more axial to the glycon. 2 Similarly, for b-1,4 glycosidic bonds we constructed maps of b- In the b- SO conformation the aglycon is pseudo-axial to the gly- 4 2 cellobiose and b-thiocellobiose with C1 glycon conformations con. This causes C6-OH clashes in b- SO-cellobiose to be less fre- 1 (Figs. 5 and 6). As mentioned above, many (as well as quent than in b- S3-cellobiose, and the energy range from global 4 2 other GHs) distort the glycon away from the most stable C1 con- minimum to mountaintop is not as pronounced in b- SO-cellobiose 1 2 formation. The most common distortion observed in crys- as in b- S3-cellobiose (Fig. 8). The global minima in the b- SO- and 1 2 4 tal structures is to a S3 (or nearby) conformation (Fig. 7). The SO the b- C1-cellobiose maps are in very similar positions (contained (or nearby) conformation has been found in some inverting cellu- within 100° < / < 50° and 180° < w < 100°). lases (Fig. 8). 4.2. Ligand intramolecular energies and torsion angles 4.1.1. a-1,4 Glycosidic bonds The a-maltose energy contour map has a global minimum near Values of / and w from crystal structures are plotted on / = 100°, w = 140° (Fig. 3). The three methyl a-acarviosinide Figs. 3–8. GHs in different families that cleave a-orb-1,4 bonds maps (methyl a-R-acarviosinide, methyl a-S-acarviosinide, and twist these bonds to different values of / and w, corresponding G. P. Johnson et al. / Carbohydrate Research 344 (2009) 2157–2166 2165

Table 2 greater than the values found at the global minimum by 4.3 kcal/ EIntra values for GH families containing crystal structures of enzymes with oligosac- mol (Fig. 3). Wild-type GH13 enzymes bind a-maltose oligomers charides bound in subsites 1 and +1 with higher EIntra values, an average of 6.2 kcal/mol. For GH13 GH family Map Average EIntra in all members twisting the methyl a-acarviosinide glycosidic bond, structures (kcal/mol) the average value of EIntra is 3.6 kcal/mol. The one GH77 member 4 2 b- C1-Cellobiose 1.2 available for analysis imposes strain with an EIntra value of 3 b-Thiocellobiose 2 1.3 kcal/mol. 5 b-4C -Cellobiose 2 (4)a 1 All four GH14 members bind ligands with glycons in the 4C 5 b-Thiocellobiose 0.5 1 4 conformation. Their torsion angles are grouped on the -maltose 6 b- C1-Cellobiose 5.2 a 6 b-Thiocellobiose 1.8 map far from those of GH13 and GH77 members, near / = 140°, 4 7 b- C1-Cellobiose 1.5 w = 260° (Fig. 3). GH14 members distort the glycosidic bond to 2 8 b- SO-Cellobiose 2.5 (2.5) 1 EIntra values of 9.5 kcal/mol on average, imposing the greatest 9 b- S3-Cellobiose 1.5 (1.5) 1 scissile bond strain observed in GHs so far. Wild-type GH14 mem- 12 b- S3-Cellobiose 1.9 (1.9) 13 a-Maltose 4.3 (6.2) bers distort the glycosidic bond to the slightly higher EIntra value of 13 Methyl a-acarviosinide 3.6 9.7 kcal/mol. 14 a-Maltose 9.5 (9.7) The five GH15 members and one GH57 member bind ligands 15 Methyl a-acarviosinide 2.6 2 1 with glycons having H3 conformations, since methyl a-acarviosi- 44 b- S3-Cellobiose 2.1 (2.1) 57 Methyl a-acarviosinide 2.2 nide groups are found in subsites 1 and +1. Their torsion angles 77 a-Maltose 1.3 (1.3) are found around / = 100° and w = 120° to 110° (Fig. 4), distant

a from GH13, GH14, and GH77 members found in Figure 3 and GH13 In parentheses: Average EIntra of wild-type structures and those with slight mutations (e.g., Glu to Gln). members found in Figure 4. GH15 and GH57 enzymes bind methyl a-acarviosinide and twist the glycosidic bond torsions to EIntra val- ues of 2.6 and 2.2 kcal/mol, respectively. Figures 5–8 show the torsion angles of GHs that hydrolyze b-1,4 to different EIntra values. Table 2 shows the average EIntra value for glycosidic bonds. Because oxygen and sulfur glycosidic atoms yield each GH family. Values of EIntra shown in parentheses in Table 2 are substantially different valence angles and bond lengths, compari- of crystal structures of wild-type GHs or of those having slight sons between members of the same GH family (in this case, GH5 mutations (e.g., Glu to Gln) binding the natural substrate. Interest- and GH6) having ligands with different glycosidic atoms should ingly, those enzymes with more radical mutations bind ligands be made with caution. with the same or lower EIntra values than the average EIntra of the On average, GHs that hydrolyze b-1,4-glycosidic bonds force family, suggesting that these mutations interfere with the en- their ligands to have torsion angles that yield lower EIntra values zyme’s ability to twist glycosidic bonds optimally for reaction. than those imposed by GHs that hydrolyze a-1,4-glycosidic GHs that contain a proton donor in the anti position twist the bonds. scissile bond in the opposite direction (with respect to the lowest GH5 members bind cello-oligosaccharides with an imposed energy point on the map) than GHs with a syn-positioned proton average EIntra value of 2 kcal/mol, with one of the GH5-bound li- 4 donor. In GHs that break a-1,4-glycosidic bonds, the data suggest gands lying very close to the global minimum of the b- C1-cellobi- a preference for ligands bound to GHs with an anti-positioned ose map and the other lying 4 kcal/mol above the minimum proton donor that lies to the left (lower / values) of the mini- (Fig. 5). There are also two GH5 structures that bind thiocello-oli- mum (Figs. 3 and 4). Ligands bound to GHs with a syn-positioned gosaccharide ligands with an average EIntra value of 0.5 kcal/mol proton donor lie to the right (higher / values) of the minimum. (Fig. 6). For GH2, which is part of the same Clan A as GH5, there For GHs that break b-1,4 glycosidic bonds, the opposite is true. is only one available crystal structure for analysis, and it binds a Data for GHs with anti-positioned proton donors are located to molecule. The only difference between cellobiose and lac- the right of the lowest energy point on the maps, while the left tose is the configuration of the hydroxyl group attached to the gly- side of the minimum is populated by data from GHs containing con C4 atom, and their //w maps are very similar to each other.26 a syn-positioned proton donor. This suggests that twisting of The available data point for GH2 would fall near the minimum of 4 the glycosidic bond is necessary for the correct orientation of the b- C1-cellobiose map, at 1.2 kcal/mol (Fig. 5). one of the glycosidic oxygen lone pairs toward the proton donor. There are three GH6 data points, two for cello--

The only exceptions are the three GH13 outliers with a mutated bound complexes with an average EIntra value of 5.2 kcal/mol proton donor (see below) in the a-maltose map, one point in (Fig. 5) and one for a thiocello-oligosaccharide-bound complex the methyl a-acarviosinide map for a GH57 member, a GH2 with a significantly different / value and an EIntra value of member in the b-cellobiose map, and a GH44 member in the 1.8 kcal/mol (Fig. 6). The last point in the b-thiocellobiose map 1 b- S3-cellobiose map. However, the latter two are very close to is for a GH3-bound ligand with EIntra 2 kcal/mol. Only one struc- the global minimum. ture is available for GH7, and it binds a cello-oligosaccharide with

The majority of GH13 members and the one GH77 member, all EIntra 1.5 kcal/mol. 4 2 1 part of Clan H, bind ligands of both C1 (a-maltose) and H3 GH12 and GH44 members bind cello-oligosaccharides with S3- (methyl a-acarviosinide) glycon conformations with torsion angles distorted sugar rings to their active sites. GH9 members twist the of 10° < / <60° and 170° < w < 130° (Figs. 3 and 4). This is con- sugar ring in the active site to a 4E conformation, which is very 1 sistent with the fact that methyl a-acarviosinide is an excellent close to the S3 conformation. Therefore, we included the GH9 data 1 inhibitor of maltose-degrading enzymes. However, three GH13 point in the b- S3-cellobiose energy contour map (Fig. 7). EIntra for structures have ligands with / values from 130° to 170° and with the GH9-bound cello-oligosaccharide substrate is 1.5 kcal/mol w values between 110° and 90° (Fig. 3). These are the only above the //w minimum. For GH12, EIntra 1.9 kcal/mol and for structures in which the catalytic proton donor (Glu257) has been GH44, EIntra 2.1 kcal/mol. There is only one available structure 2 mutated to alanine, which probably caused their different loca- to plot on the b- SO-cellobiose energy contour map (Fig. 8), and it 2 tions on the //w map. A superimposition of GH13 members shows is from GH8 with EIntra of 2.5 kcal/mol compared to the SO-cello- 1 a significantly different sugar binding position in the three outliers. biose minimum energy structure. In general, the EIntra values of S3- 2 Average values of EIntra for all GH13-bound maltose structures are or SO-distorted structures are not significantly different from the 2166 G. P. Johnson et al. / Carbohydrate Research 344 (2009) 2157–2166

4 EIntra values of undistorted ( C1) structures bound to other GHs that Acknowledgments break b-1,4 glycosidic bonds. L.P. and P.J.R. thank the U.S. Department of Agriculture for its 5. Conclusions financial support through the Biotechnology Byproducts Consor- tium and through National Research Initiative Grant 2007- We have calculated hybrid QM::MM //w energy maps for 35504-18252 from the USDA Cooperative State Research, Educa- a-maltose, methyl a-acarviosinide, and b-thiocellobiose, as well tion, and Extension Service. G.P.J. and A.D.F. were supported by as for b-cellobiose in three different ring shapes, to measure the normal funds of the USDA Agricultural Research Service. The strain on different glycosidic bonds upon twisting. We have also authors thank Wim Nerinckx (University of Ghent) for his very plotted data obtained from protein–oligosaccharide complex crys- helpful advice. tal structures to estimate the protein-imposed strain on their sub- strates. In general, members of the same GH family twist the Supplementary data scissile bond similarly, while members of different families twist the bond to different regions on the energy contour maps. The Supplementary data associated with this article can be found, in GH-imposed strain ranges from 0.5 to 9.5 kcal/mol. GHs that the online version, at doi:10.1016/j.carres.2009.08.011. break b-1,4 glycosidic bonds and have an anti-positioned proton donor distort the scissile glycosidic bond to higher / values with References respect to the minimum. GHs with syn proton donors twist the bond to lower / values. The opposite is true in GHs that break 1. Phillips, D. C. Sci. Am. 1966, 215, 78–90. 2. Davies, G. J.; Ducros, V. M.; Varrot, A.; Zechel, D. L. Biochem. Soc. Trans. 2003, 31, a-1,4 glycosidic bonds. 523–527. Distortion of the bond is mechanistically relevant, as it helps to 3. Vocadlo, D. J.; Davies, G. J. Curr. Opin. Chem. Biol. 2008, 12, 539–555. orient one of the lone pairs in the glycosidic oxygen atom toward 4. French, A. D.; Johnson, G. P.; Kelterer, A.-M.; Dowd, M. K.; Cramer, C. J. Int. J. the proton donor, assisting proton transfer. Quantum Chem. 2001, 84, 416–425. 5. Fujimoto, Z.; Takase, K.; Doui, N.; Momma, M.; Matsumoto, T.; Mizuno, H. J. Mol. Biol. 1998, 277, 393–407. Note added in proof 6. Tvaroška, I.; Bleha, T. Can. J. Chem. 1979, 57, 424–435. 7. Warshel, A.; Levitt, M. J. Mol. Biol. 1976, 103, 227–249. 8. Cantarel, B. L.; Coutinho, P. M.; Rancurel, C.; Bernard, T.; Lombard, V.; Henrissat, The conformation of the scissile linkage in thiocellopentaose B. Nucleic Acids Res. 2009, 37, D233–238. http://www.cazy.org. was related to an energy surface for thiocellobiose as part of an 9. Koshland, D. E. Biol. Rev. Cambridge Phil. Soc. 1953, 28, 416–436. 27 10. Heightman, T. D.; Vasella, A. T. Angew. Chem., Int. Ed. 1999, 38, 750–770. extensive study of inhibition by S-linked . In that work 11. DeLano Scientific, Palo Alto, CA. http://www.pymol.org. they showed HF/6-31G** maps for analogs of O- and S-linked malt- 12. Nerinckx, W.; Desmet, T.; Claeyssens, M. Arkivoc 2006, xiii, 1–27. ose and cellobiose, computed with the rigid residue method. Their 13. Biarnés, X.; Ardèvol, A.; Planas, A.; Rovira, C.; Laio, A.; Parrinello, M. J. Am. Chem. O-linked analog maps, based on ring fragments, were similar to our Soc. 2007, 129, 10686–10693. 14. Johnson, G. P.; Stevens, E. D.; French, A. D. Carbohydr. Res. 2007, 342, 1210–1222. previously published analog maps based on tetrahydropyran rings, 15. French, A. D.; Kelterer, A.-M.; Cramer, C. J.; Johnson, G. P.; Dowd, M. K. but with considerably higher peaks and saddle points that can be Carbohydr. Res. 2000, 326, 305–322. attributed to their rigid residues. Their map for the analog of thio- 16. Peralta-Inga, Z.; Johnson, G. P.; Dowd, M. K.; Rendleman, J. A.; Stevens, E. D.; French, A. D. Carbohydr. Res. 2002, 337, 851–861. cellobiose has greater differences from our map (not shown), with 17. French, A. D.; Johnson, G. P. 2004, 11, 5–22. ours having only a single central minimum. Their relaxed adiabatic 18. French, A. D.; Kelterer, A. M.; Johnson, G. P.; Dowd, M. K.; Cramer, C. J. J. Mol. maps for cellobiose and thiocellobiose have substantial differences Graphics Modell. 2000, 18, 95–107. 19. Schmidt, M. W.; Baldridge, K. K.; Boatz, J. A.; Jensen, J. H.; Koseki, S.; Matsunaga, from our corresponding maps. This is due to their use of the CSFF N.; Gordon, M. S.; Nguyen, K. A.; Su, S.; Elbert, S. T.; Windus, T. L.; Montgomery, force field and apparently, a dielectric constant of 1.0 instead of J.; Dupuis, M. J. Comput. Chem. 1993, 14, 1347–1363. http://www.msg.chem. our choices of a hybrid map and a dielectric value of 7.5. Addition- iastate.edu. 20. Schrödinger, LLC, New York. http://www.schrodinger.com. ally, their map for thiocellobiose is not fully relaxed, as indicated 21. Allinger, N. L.; Rahman, M.; Lii, J.-H. J. Am. Chem. Soc. 1990, 112, 8293–8307. by the difference in energy values and positions of the minima 22. Mendonca, S.; Johnson, G. P.; French, A.; Laine, R. A. J. Phys. Chem. A 2002, 106, on the w = 180° and w = +180° lines. Unlike their work, we did 4115–4124. 23. Jorgensen, W. L.; Tirado-Rives, J. J. Am. Chem. Soc. 1988, 110, 1657–1666. not plot the / and w values for the pentamer in the complex with 24. Sigurskjold, B. W.; Berland, R.; Svensson, B. 1994, 33, 10191– the Fusarium oxysporum endoglucanase because the resi- 10199. 4 25. Coutinho, P. M.; Dowd, M. K.; Reilly, P. J. Proteins: Struct. Funct. Genet. 1998, 27, due of that thiopentasaccharide in the –1 site is not in the C1 con- formation. Our procedure would have required an additional plot 235–248. 26. Rees, D. A.; Skerrett, R. J. Carbohydr. Res. 1968, 7, 334–348. with the experimental ring geometry. 27. Venter, G. A.; Matthews, R. P.; Naidoo, K. J. Mol. Simulat. 2008, 34, 391–401.