Chapter 2: Structure and Function

Acid/base chemistry is crucial for living organisms (pH control and acid/base ______).

Water as the reference acid with a generic base, B:
  B: + H-O-H ⇌ B-H+ + HO:-    (Kb and pKb)
Water as an acid donates a proton. Hydroxide is the conjugate base of water.

Water as the reference base with a generic acid, HA:
  H2O: + H-A ⇌ H3O+ + A:-    (Ka and pKa)
Water as a base accepts a proton. The hydronium ion is the conjugate acid of water.

Curved arrows emphasize electron movement.

Assumptions: [H2O] ≈ 55.5 M and T = 300 K.

$K_{eq} = \frac{[H^+][A^-]}{[HA][H_2O]}$

$K_a = K_{eq}[H_2O] = \frac{[H^+][A^-]}{[HA]}$

By definition, p(x) = -log(x), so pKa = -(log Ka):

$-\log K_a = -\log[H^+] - \log\frac{[A^-]}{[HA]}$

$pK_a = pH - \log\frac{[A^-]}{[HA]}$

When [A:-] = [HA], pKa = pH.

Also true:

$\Delta G = -2.3RT\log K_a = 2.3RT(-\log K_a) = 2.3RT(pK_a)$

$\Delta G = (\text{constant})(pK_a)$ and $K_{eq} = 10^{-\Delta G/(2.3RT)}$

R ≈ 2.0 cal/(mol·K) ≈ 8.4 J/(mol·K)
ΔG ≈ 1.4 pKa ≈ 1 pKa (kcal/mole)
ΔG ≈ 5.8 pKa ≈ 6 pKa (kJ/mole)

ΔG = ΔH - TΔS, where ΔG = free energy, ΔH ≈ bond energies, ΔS ≈ probabilities (randomness).

Weak acids (RCO2H, ROH, RSH, RNH3+, H3PO4, H2PO4-, HPO4-2, H2CO3, HCO3-, etc.)

  B: + H-Y ⇌ B-H+ + Y:-
  weaker acid & base (more stable) on the left, stronger acid & base on the right; endergonic ΔG.

[Figure: potential energy (PE) versus progress of reaction (POR) for a weak acid, passing through a transition state (TS); the products lie higher in energy than the reactants.]

The equilibrium shifts towards the weaker conjugate acid and base (away from the stronger acid and base). Weaker is more stable (think "less reactive").

Strong acids (HCl, HBr, HI, H2SO4, HNO3, etc.)

  B: + H-A ⇌ B-H+ + A:-
  stronger acid & base on the left, weaker acid & base (more stable) on the right; exergonic ΔG.

[Figure: potential energy (PE) versus progress of reaction (POR) for a strong acid, passing through a transition state (TS); the products lie lower in energy than the reactants.]
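The relation ΔG = 2.3RT(pKa) above can be checked numerically. The following is a minimal sketch, assuming the rounded constants from the text (R ≈ 2.0 cal/(mol·K) ≈ 8.4 J/(mol·K), T = 300 K); the function names are hypothetical and chosen only for illustration.

```python
import math

# Rounded constants taken from the text; T = 300 K is the text's assumption.
R_KCAL = 2.0e-3   # kcal/(mol*K)
R_KJ = 8.4e-3     # kJ/(mol*K)
T = 300.0         # K


def delta_g_from_pka(pka, gas_constant=R_KCAL, temperature=T):
    """Free energy of ionization, dG = 2.3*R*T*pKa (ln 10 stands in for 2.3)."""
    return math.log(10) * gas_constant * temperature * pka


def keq_from_delta_g(delta_g, gas_constant=R_KCAL, temperature=T):
    """Inverse relation, Keq = 10^(-dG / (2.3*R*T))."""
    return 10 ** (-delta_g / (math.log(10) * gas_constant * temperature))


if __name__ == "__main__":
    # pKa values quoted elsewhere in these notes.
    for name, pka in [("H2CO3", 6.4), ("H2PO4-", 7.2)]:
        dg_kcal = delta_g_from_pka(pka)
        dg_kj = delta_g_from_pka(pka, gas_constant=R_KJ)
        print(f"{name}: pKa {pka} -> {dg_kcal:.1f} kcal/mol, {dg_kj:.1f} kJ/mol")
```

With these rounded constants each pKa unit costs about 1.4 kcal/mol (about 5.8 kJ/mol), which is where the "≈ 1 pKa" and "≈ 6 pKa" rules of thumb come from.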

Amino Acids with Nonpolar "R" groups (have two pKa's)

All aa Cα chiral centers are S except cysteine (because of the sulfur). Body pH ≈ 7.4.

Name            3-letter code   1-letter code   pKa (CO2H)   pKa (NH3+)
glycine         Gly             G               2.35         9.78
alanine         Ala             A               2.35         9.87
valine          Val             V               2.29         9.74
leucine         Leu             L               2.33         9.74
isoleucine      Ile             I               2.32         9.76    (2S, 3S)
phenylalanine   Phe             F               2.16         9.18
tryptophan      Trp             W               2.43         9.44
proline         Pro             P               1.95         10.65

methionine (Met, M): drawn with pKa = 2.33 (CO2H) and pKa = 9.74 (NH3+)

Some amino acids have an additional pKa. The two pKa's common to every amino acid are:

  H3N+-CHR-CO2H ⇌ H3N+-CHR-CO2-    (Ka1, pKa1)
  H3N+-CHR-CO2- ⇌ H2N-CHR-CO2-     (Ka2, pKa2)

Our bodies need 20 amino acids to make our proteins (maybe 22 with some selenium variations).

Amino Acids with Polar "R" groups

Body pH ≈ 7.4.

Name         3-letter code   1-letter code   pKa (CO2H)   pKa (NH3+)   pKa (side chain)
serine       Ser             S               2.19         9.21         ≈ 13 (OH)
threonine    Thr             T               2.09         9.10         ≈ 13 (OH)      (2S, 3R)
cysteine     Cys             C               2.19         10.25        8.33 (SH)
asparagine   Asn             N               2.1          8.84         ≈ 15 (amide NH2)
glutamine    Gln             Q               2.17         9.13         ≈ 15 (amide NH2)
tyrosine     Tyr             Y               2.20         9.11         10.13 (OH)

Dimer: cystine = 2 x cysteine with an S-S linkage.

Other relevant biological pKa values

phosphoric acid:
  H3PO4 ⇌ H2PO4-      pKa = 2.1
  H2PO4- ⇌ HPO4-2     pKa = 7.2
  HPO4-2 ⇌ PO4-3      pKa = 12.4

carbonic acid:
  H2CO3 ⇌ HCO3-       pKa = 6.4
  HCO3- ⇌ CO3-2       pKa = 10.3

Amino Acids with Charged "R" groups (have three pKa's)

Body pH ≈ 7.4.

Name            3-letter code   1-letter code   pKa (CO2H)   pKa (NH3+)   pKa (side chain)
cysteine        Cys             C               2.19         10.25        8.33 (SH)
aspartic acid   Asp             D               1.99         9.90         3.90
glutamic acid   Glu             E               2.10         9.47         4.07
lysine          Lys             K               2.16         9.18         10.79
arginine        Arg             R               1.82         8.99         12.48
histidine       His             H               1.80         9.33         6.04

Essential AAs: Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Threonine, Valine
Nonessential AAs: Alanine, Asparagine, Cysteine, Glutamic acid, Glutamine, Glycine, Proline, Serine, Tyrosine, Selenocysteine, Ornithine

All amino acids are "S" absolute configuration at the Cα position, except cysteine (because the sulfur atom changes the order of priorities). Isoleucine (3S) and threonine (3R) have a second chiral center. These are the starting points for our body's proteins. Their pKa's can change in an actual protein environment due to nearby hydrophobic, hydrophilic and/or ionic groups.

Henderson-Hasselbalch Equation

$pH = pK_a + \log\frac{[A^-]}{[HA]}$, so pKa = pH when [HA] = [A:-].

Typical pH values: extracellular (blood) ≈ 7.4, intracellular ≈ 6.8, stomach ≈ 1.5 - 3.5, small intestines ≈ 8.5.

What do the amino acids look like at pH 7.4?

Typical aa ammonium acid ionization constant (pKa ≈ 9.4), RNH3+ : RNH2 = 100 : 1
  $7.4 = 9.4 + \log\frac{[A]}{[HA^+]}$
  $\log\frac{[A]}{[HA^+]} = (7.4 - 9.4) = -2.0$
  $\frac{[A]}{[HA^+]} = 10^{-2.0} = 1/100$

Typical aa carboxylic acid ionization constant (pKa ≈ 2), RCO2H : RCO2- = 1 : 250,000
  $7.4 = 2 + \log\frac{[A^-]}{[HA]}$
  $\log\frac{[A^-]}{[HA]} = (7.4 - 2) = 5.4$
  $\frac{[A^-]}{[HA]} = 10^{5.4} = 2.5 \times 10^{5} = 250{,}000/1$

Lysine side chain (third pKa, pKa = 10.79 ≈ 10.8), RNH3+ : RNH2 = 2,500 : 1
  $7.4 = 10.8 + \log\frac{[A]}{[HA^+]}$
  $\log\frac{[A]}{[HA^+]} = (7.4 - 10.8) = -3.4$
  $\frac{[A]}{[HA^+]} = 10^{-3.4} = 1/2{,}500$

Aspartic acid and glutamic acid side chains (second pKa, pKa ≈ 4), RCO2H : RCO2- = 1 : 2,500
  $7.4 = 4 + \log\frac{[A^-]}{[HA]}$
  $\log\frac{[A^-]}{[HA]} = (7.4 - 4) = 3.4$
  $\frac{[A^-]}{[HA]} = 10^{3.4} = 2.5 \times 10^{3} = 2{,}500/1$

Phosphate (pKa ≈ 2.1, 7.2 and 12.4 for H3PO4, H2PO4- and HPO4-2): at pH 7.4, H2PO4- : HPO4-2 = 1.0 : 1.6.
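The worked ratios above all follow from rearranging the Henderson-Hasselbalch equation to [A-]/[HA] = 10^(pH - pKa). A minimal sketch of that arithmetic follows; the function names are hypothetical and the pKa values are the rounded ones used in the text.

```python
def base_to_acid_ratio(ph, pka):
    """[A-]/[HA] from the Henderson-Hasselbalch equation."""
    return 10 ** (ph - pka)


def percent_deprotonated(ph, pka):
    """Percentage of the group present as the conjugate base at this pH."""
    ratio = base_to_acid_ratio(ph, pka)
    return 100 * ratio / (1 + ratio)


if __name__ == "__main__":
    blood_ph = 7.4
    examples = [
        ("typical aa ammonium", 9.4),
        ("typical aa carboxylic acid", 2.0),
        ("lysine side chain", 10.8),
        ("Asp/Glu side chain", 4.0),
        ("H2PO4-", 7.2),
    ]
    for name, pka in examples:
        ratio = base_to_acid_ratio(blood_ph, pka)
        print(f"{name}: [base]/[acid] = {ratio:.3g}, "
              f"{percent_deprotonated(blood_ph, pka):.2f}% deprotonated")
```

A ratio below 1 means the protonated form dominates (ammonium groups at pH 7.4); a ratio above 1 means the deprotonated form dominates (carboxylic acids at pH 7.4).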

Arginine side chain (third pKa, pKa = 12.48 ≈ 12.5), protonated : neutral = 126,000 : 1
  $7.4 = 12.5 + \log\frac{[A]}{[HA^+]}$
  $\log\frac{[A]}{[HA^+]} = (7.4 - 12.5) = -5.1$
  $\frac{[A]}{[HA^+]} = 10^{-5.1} = 1/126{,}000$

Histidine side chain (second pKa, pKa = 6.04 ≈ 6), protonated : neutral = 1 : 25
  $7.4 = 6 + \log\frac{[A]}{[HA^+]}$
  $\log\frac{[A]}{[HA^+]} = (7.4 - 6) = 1.4$
  $\frac{[A]}{[HA^+]} = 10^{1.4} = 2.5 \times 10^{1} = 25/1$

Tyrosine side chain (third pKa, pKa ≈ 10.1), ArOH : ArO- = 500 : 1
  $7.4 = 10.1 + \log\frac{[A^-]}{[HA]}$
  $\log\frac{[A^-]}{[HA]} = (7.4 - 10.1) = -2.7$
  $\frac{[A^-]}{[HA]} = 10^{-2.7} = 1/500$

Cysteine side chain (second pKa, pKa = 8.33 ≈ 8.3), RSH : RS- = 8 : 1
  $7.4 = 8.3 + \log\frac{[A^-]}{[HA]}$
  $\log\frac{[A^-]}{[HA]} = (7.4 - 8.3) = -0.9$
  $\frac{[A^-]}{[HA]} = 10^{-0.9} = 1/8$

Serine and threonine side chains (third pKa, pKa ≈ 13), ROH : RO- ≈ 400,000 : 1
  $7.4 = 13 + \log\frac{[A^-]}{[HA]}$
  $\log\frac{[A^-]}{[HA]} = (7.4 - 13) = -5.6$
  $\frac{[A^-]}{[HA]} = 10^{-5.6} \approx 1/400{,}000$

Carbonate (pKa = 6.4 and 10.3 for H2CO3 and HCO3-): at pH 7.4, H2CO3 : HCO3- = 1 : 10.

Any pKa value can be shifted left or right by its enzyme environment. More hydrophobic regions will favor the neutral forms (RCO2H, RNH2). A nearby opposite charge will favor the ionic form (nearby positive favors negative and vice versa). An open environment that allows access to water is similar to the reference aqueous values (obtained in water). It is therefore hard to determine the form of a functional group (ionic or neutral) in a particular region of a protein without knowing something about its structure.

Summary at pH 7.4 (major form : minor form):
  pKa ≈ 4 (aspartic acid and glutamic acid, second pKa): 2,500/1
  pKa ≈ 5: 250/1
  pKa ≈ 6: 25/1
  pKa ≈ 7.4: 1/1
  pKa ≈ 8.4: 10/1
  pKa ≈ 9.4 (typical aa ammonium acid ionization constant): 100/1
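The same side-chain calculations can be repeated for the other body compartments listed with the Henderson-Hasselbalch equation. The sketch below is illustrative only: it uses the aqueous side-chain pKa values and the compartment pH values quoted in these notes (with both ends of the stomach range), ignores any pKa shift from the protein environment, and uses hypothetical function and variable names.

```python
# Side-chain pKa values as quoted in these notes (aqueous reference values).
SIDE_CHAIN_PKA = {
    "Asp": 3.90, "Glu": 4.07, "His": 6.04, "Cys": 8.33,
    "Tyr": 10.13, "Lys": 10.79, "Arg": 12.48,
}

# Compartment pH values as quoted in these notes.
COMPARTMENT_PH = {
    "stomach (low)": 1.5, "stomach (high)": 3.5,
    "intracellular": 6.8, "blood": 7.4, "small intestines": 8.5,
}


def dominant_form(ph, pka):
    """Return which form dominates and by how much (major/minor ratio)."""
    ratio = 10 ** (ph - pka)          # [deprotonated]/[protonated]
    if ratio >= 1:
        return "deprotonated", ratio
    return "protonated", 1 / ratio


if __name__ == "__main__":
    for residue, pka in SIDE_CHAIN_PKA.items():
        for place, ph in COMPARTMENT_PH.items():
            form, ratio = dominant_form(ph, pka)
            print(f"{residue} (pKa {pka}) in {place} (pH {ph}): "
                  f"{form} by about {ratio:,.0f} : 1")
```

Histidine is the side chain whose dominant form is most sensitive to the compartment, because its pKa lies closest to physiological pH.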

What happens to the pKa when...

For a carboxylic acid (RCO2H ⇌ RCO2-):
  There is a nearby negative charge?   a. pKa is higher   b. pKa is similar   c. pKa is lower
  There is a hydrophobic pocket?       a. pKa is higher   b. pKa is similar   c. pKa is lower
  There is a nearby positive charge?   a. pKa is higher   b. pKa is similar   c. pKa is lower

For an ammonium ion (RNH3+ ⇌ RNH2):
  There is a nearby negative charge?   a. pKa is higher   b. pKa is similar   c. pKa is lower
  There is a hydrophobic pocket?       a. pKa is higher   b. pKa is similar   c. pKa is lower
  There is a nearby positive charge?   a. pKa is higher   b. pKa is similar   c. pKa is lower

Because of resonance of the nitrogen lone pair with the C=O, the amide bonds are planar. This is called the ______.

s = single bond conformation (trans or cis). The s-trans conformation is favored ≈ 1000/1 over s-cis, which suffers from steric crowding.

[Figure: amide resonance structures, and the s-trans versus s-cis conformations of the peptide backbone (1,000 to 1). Each amide unit along the chain is flat.]

The flat shape of the amide bond limits possible conformations of proteins.

Proteins are polymers. Chains of fewer than 40 amino acids are considered peptides. Their specific spatial conformations are controlled by ionic interactions, hydrogen bonding, dispersion forces and disulfide linkages. The most common ways to determine their 3D structure are X-ray crystallography and NMR spectroscopy. The amino end is referred to as the N-terminus and the carboxyl end is referred to as the C-terminus. Counting residues always starts at the N-terminus (-NH3+) and finishes at the C-terminus (-CO2-). The primary structure of a protein is determined by the gene sequence in DNA, as transcribed to the RNA, as translated into the protein (with the possibility of post translational modifications, which cannot be determined from the DNA sequence).

Primary protein structure = the linear order of amino acids.

[Figure: a peptide chain drawn from the N-terminus (#1) to the C-terminus (#n), with alanine, serine, cysteine, phenylalanine and tyrosine side chains labeled.]

Possible variety = (20)^n
  n=1 (20 choices)
  n=2 (400 choices)
  n=3 (8,000 choices)
  n=4 (160,000 choices)
  n=100 (20^100 choices)

Secondary structure refers to highly regular local sub-structures on the actual polypeptide backbone chain. The two main types of secondary structure are alpha helices and beta pleated sheets. In 1951 Linus Pauling suggested both alpha helices and beta sheets as a way of maximizing all the hydrogen bond donors and acceptors in the peptide backbone.
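The "possible variety = (20)^n" count for primary structure given earlier can be reproduced with exact integer arithmetic. This is a minimal sketch; the function name is hypothetical, and the default of 20 building blocks is the number used in the text.

```python
def sequence_count(n_residues, n_amino_acids=20):
    """Number of distinct linear sequences of the given length."""
    return n_amino_acids ** n_residues


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(f"n={n}: {sequence_count(n):,} choices")
    # Python integers are arbitrary precision, so 20^100 is computed exactly.
    big = sequence_count(100)
    print(f"n=100: a {len(str(big))}-digit number of choices")
```

Even a small protein of 100 residues has far more possible sequences than could ever be sampled, which is why the gene-encoded order matters.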

β-pleated sheets can be parallel or antiparallel.

[Figure: parallel and antiparallel β-sheet strands, each labeled from residue #1 (H3N+ end) to residue #n (CO2- end).]

Proline is unusual in that both trans and cis conformations are possible. It is referred to as a disrupter of normal protein patterns (α helices and β pleated sheets).

s-trans is favored over s-cis by only ≈ 3/1 in proline (this also depends on tensions in the protein chain).
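The two population ratios quoted for the amide bond (≈ 1000/1 for a typical residue, ≈ 3/1 for proline) can be turned into approximate free energy differences with the same ΔG = 2.3RT log K relation used for pKa values. The sketch below is an illustration, not a measured result: it assumes the rounded R ≈ 2.0 cal/(mol·K) and T = 300 K from the acid/base section, and the function name is hypothetical.

```python
import math


def delta_g_from_ratio(ratio, gas_constant=2.0e-3, temperature=300.0):
    """Free energy gap (kcal/mol) between two conformers, dG = 2.3*R*T*log10(ratio).

    gas_constant is in kcal/(mol*K); ln 10 stands in for the rounded 2.3.
    """
    return math.log(10) * gas_constant * temperature * math.log10(ratio)


if __name__ == "__main__":
    for label, ratio in [("typical amide, s-trans/s-cis", 1000),
                         ("proline, s-trans/s-cis", 3)]:
        print(f"{label} = {ratio}/1 -> about "
              f"{delta_g_from_ratio(ratio):.1f} kcal/mol")
```

Under these assumptions the typical amide prefers s-trans by roughly 4 kcal/mol, while proline's preference is under 1 kcal/mol, small enough that both conformations are populated.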

[Figure: s-trans and s-cis amide conformations at proline. The nitrogen substituent that is normally an H is a ring carbon in proline, so both conformations are crowded.]

Tertiary Structure

Tertiary structure refers to the three-dimensional structure of monomeric and multimeric protein molecules. The alpha-helices and beta pleated-sheets are folded into a compact globular structure. The folding is partly driven by nonspecific hydrophobic interactions (the burial of hydrophobic residues from water), but the structure is stable only when the parts of a protein domain are locked into place by specific tertiary interactions, such as salt bridges, hydrogen bonds, the tight packing of side chains, and disulfide bonds. Disulfide bonds are extremely rare in cytosolic proteins, since the cytosol (intracellular fluid) is generally a reducing environment. NADH can reduce the disulfide bond to 2 thiols.

[Figure: a folded protein with an α-helix, β-pleated sheets and random strands labeled.]

Quaternary structure refers to the three-dimensional structure of a multi-subunit protein and how the subunits fit together. The quaternary structure is stabilized by the same non-covalent interactions and disulfide bonds as the tertiary structure. Complexes of two or more polypeptides are called multimers. Specifically, a multimer is called a dimer if it contains two subunits, a trimer if it contains three subunits, a tetramer if it contains four subunits, etc. The subunits are frequently related to one another by symmetry operations, such as a 2-fold axis in a dimer. Multimers made up of identical subunits are referred to with a prefix of "homo-" (e.g. a homotetramer) and those made up of different subunits are referred to with a prefix of "hetero-", for example, a heterotetramer, such as the two alpha and two beta chains of hemoglobin.

Forces of interaction (strength)
1. covalent
2. ionic
3. hydrogen bonds
4. dispersion forces and pi stacking (similar)

Polar forces are more common on the outside where they can interact with water (hydrophilic). Nonpolar forces are more common on the inside where they can avoid water (hydrophobic).

[Figure: a protein showing side-chain interactions: hydrogen bonds (e.g. histidine, threonine, tyrosine), ionic bonds between ammonium ions (lysine, arginine) and carboxylate ions or phosphorylated serine, dispersion forces (alanine, valine, leucine, isoleucine, methionine, phenylalanine), a cysteine-cysteine S-S bond, and pi stacking (phenylalanine, tryptophan).]

One possible way disulfide bonds can form involves cytochrome P-450s and oxygen (previous topic). There are other possibilities too.

[Figure: mechanism scheme with these labels: sulfur substrate (nitrogen too), cytochrome P-450 (Fe+3, Fe+4, 1e-), sulfoxides, water, disulfides.]

Vasopressin has two primary functions in the body: to retain water and to constrict blood vessels. It is synthesized in the hypothalamus and stored in vesicles at the posterior pituitary, where it is released into the bloodstream. It is thought to have an important role in social behavior and has a short half-life of 16–24 minutes.

The hormone oxytocin is produced in the hypothalamus and released by the posterior pituitary gland. The functions of oxytocin include maternal bonding, milk production, uterine contractions during labor, sexual pleasure, reduced fear, and love.

Insulin has 3 disulfide bonds (lots of post translational modification).

GLUT4-containing vesicles fuse to the membrane to allow glucose into the cell.

Diabetes mellitus (DM), commonly referred to as diabetes, is a group of metabolic diseases in which there are high blood sugar levels over a prolonged period. Symptoms of high blood sugar include frequent urination, increased thirst, and increased hunger. If left untreated, diabetes can cause many complications. Acute complications include diabetic ketoacidosis and nonketotic hyperosmolar coma. Serious long-term complications include cardiovascular disease, stroke, chronic kidney failure, foot ulcers, and damage to the eyes.

Diabetes is due to either the pancreas not producing enough insulin or the cells of the body not responding properly to the insulin produced. There are three main types of diabetes mellitus:

Type 1 DM results from the pancreas's failure to produce enough insulin. This form was previously referred to as "insulin-dependent diabetes mellitus" or "juvenile diabetes". The cause is unknown.

Type 2 DM begins with insulin resistance, a condition in which cells fail to respond to insulin properly. As the disease progresses a lack of insulin may also develop. This form was previously referred to as "non insulin-dependent diabetes mellitus" or "adult-onset diabetes". The primary cause is excessive body weight and not enough exercise.

Gestational diabetes is the third main form and occurs when pregnant women without a previous history of diabetes develop high blood-sugar levels.

Prevention and treatment involve a healthy diet, physical exercise, maintaining a normal body weight, and avoiding use of tobacco. Control of blood pressure and maintaining proper foot care are important for people with the disease. Type 1 DM must be managed with insulin injections. Type 2 DM may be treated with medications with or without insulin. Insulin and some oral medications can cause low blood sugar. Weight loss surgery in those with obesity is sometimes an effective measure in those with type 2 DM. Gestational diabetes usually resolves after the birth of the baby.

As of 2015, an estimated 415 million people have diabetes worldwide, with type 2 DM making up about 90% of the cases. This represents 8.3% of the adult population, with equal rates in both women and men. From 2012 to 2015, diabetes is estimated to have resulted in 1.5 to 5.0 million deaths each year. Diabetes at least doubles a person's risk of death. The number of people with diabetes is expected to rise to 592 million by 2035. The global economic cost of diabetes in 2014 was estimated to be $612 billion USD. In the United States, diabetes cost $245 billion in 2012.

The amino-acid sequence of a protein determines its native conformation, and it folds spontaneously during or after biosynthesis. The process also depends on the solvent (water or lipid bilayer), the concentration of salts, the pH, the temperature, and the presence of cofactors and of molecular chaperones.

Minimizing the number of hydrophobic side-chains exposed to water is an important driving force in protein folding (maximizing entropy of water).

Many, many, many decisions (interactions) lead to localized minima, which lead to an overall structure.

Formation of intramolecular hydrogen bonds is another important contribution to protein stability, more so in a hydrophobic core than for H-bonds exposed to the aqueous environment.

Chaperone‐assisted folding is often necessary in the crowded intracellular environment.

Aggregated misfolded proteins are associated with prion-related illnesses such as mad cow disease, and with amyloid-related illnesses such as Alzheimer's disease, Huntington's disease and Parkinson's disease.

Hydrophobic forces can be important in quaternary structures.

[Figure: a membrane protein. The hydrophobic surface faces outside toward the lipid bilayer, with a polar channel on the inside.]

[Figure: subunits surrounded by water molecules. Hydrophobic surfaces face towards each other to minimize structuring water molecules.]

Post translational modifications of proteins

N-acetylation (with acetyl CoA): the nitrogen is not protonated at body pH when it is an amide.

Hydroxylation of proline in collagen (iron-mediated oxidation, Fe+3/Fe+4): makes stronger interactions with neighbors via H bonds. Hydroxyproline ≈ 4% of amino acids in animal tissue.

Carboxylation (with biotin): the C-H is more acidic when activated by 2 x C=O.

Phosphorylation (of serine, threonine or tyrosine): turns enzymes on and turns enzymes off. ATP reacts in an acyl-like substitution reaction with Mg+2 present; ADP = leaving group. Does the Mg+2 make ADP a better or poorer leaving group?

Structural proteins in cell division: microtubules are crucial for cell division.

Paclitaxel (Taxol) is used to treat ovarian, breast, lung, pancreatic and other cancers. Paclitaxel stabilizes the microtubule polymer and stops it from disassembling, preventing cell division. It was discovered in the 1960s in the bark of the slow-growing yew tree (> 600 years to grow, 3-6 trees = 1 patient, not sustainable). A precursor was later discovered in the needles of an ornamental yew tree. Even later, genes were spliced into bacteria to synthesize the precursor.

[Figure: structure of paclitaxel (Taxol).]

Enzymes as receptors and transporters

Enzymes - life's catalysts
Receptors - life's communication system
Structural Proteins - life's framework
