2 Structure & Function

"The man who does not read good books has no advantage over the man who cannot read them." - Mark Twain

“It is structure that we look for, whenever we try to understand anything. All science is built upon this search. We like to understand, and to explain, observed facts in terms of structure.” - Linus Pauling

Introduction

When we study life at the molecular level, it becomes apparent that the structures of biological molecules are inseparable from their functions. The molecular interactions that underlie life are dependent on the structures of the molecules we are made of.

In this chapter, we will examine the structures of the major classes of biomolecules, with an eye to understanding how these structures relate to function.

Biological molecules

As noted earlier, water is the most abundant molecule in cells, and provides the aqueous environment in which cellular chemistry happens. Dissolved in this water are inorganic ions like sodium, potassium and calcium. But the distinctiveness of biochemistry derives from the vast numbers of complex, large, carbon compounds that are made by living cells. You have probably learned that the major classes of biological molecules are proteins, nucleic acids, carbohydrates and lipids. The first three of these major groups are macromolecules that are built as long polymers made up of smaller subunits or monomers, like strings of beads. The lipids, while not chains of monomers, also have smaller subunits that are assembled in various ways to make the lipid components of cells, including membranes. The chemical properties and three dimensional conformations of these molecules determine all the molecular interactions upon which life depends. Whether building structures within cells, transferring information, or catalyzing reactions, the activities of biomolecules are governed by their structures. The properties and shapes of macromolecules, in turn, depend on the subunits of which they are built.

Interactive 2.1 - The enzyme Hexokinase: as for all enzymes, the activity of hexokinase depends on its structure. Protein Database (PDB)

Building blocks

We will next examine the major groups of biological macromolecules: proteins, polysaccharides, nucleic acids, and lipids. The building blocks of the first three, respectively, are amino acids, monosaccharides (sugars), and nucleotides. Acetyl-CoA is the most common building block of lipids.

Structure & Function: Amino Acids

"It is one of the more striking generalizations of biochemistry ...that the twenty amino acids and the four bases, are, with minor reservations, the same throughout Nature." - Francis Crick

Building blocks of protein

All of the proteins on the face of the earth are made up of the same 20 amino acids. Linked together in long chains called polypeptides, amino acids are the building blocks for the vast assortment of proteins found in all living cells.

All amino acids have the same basic structure, which is shown in Figure 2.1. At the “center” of each is a carbon called the α carbon and attached to it are four groups - a hydrogen, an α-carboxyl group, an α-amine group, and an R-group, sometimes referred to as a side chain. The α carbon, carboxyl, and amino groups are common to all amino acids, so the R-group is the only unique feature in each amino acid. (A minor exception to this structure is that of proline, in which the end of the R-group is attached to the α-amine.) With the exception of glycine, which has an R-group consisting of a hydrogen atom, all of the amino acids in proteins have four different groups attached to their α carbons and consequently can exist in two mirror image forms, L and D. With only very minor exceptions, every amino acid found in cells and in proteins is in the L configuration.

Figure 2.1 - General amino acid structure

There are 22 amino acids that are found in proteins and of these, only 20 are specified by the universal genetic code. The others, selenocysteine and pyrrolysine, use tRNAs that are able to base pair with stop codons in the mRNA during translation. When this happens, these unusual amino acids can be incorporated into proteins. Enzymes containing selenocysteine, for example, include glutathione peroxidases, tetraiodothyronine 5' deiodinases, thioredoxin reductases, formate dehydrogenases, glycine reductases, and selenophosphate synthetase. Pyrrolysine-containing proteins are much rarer and are mostly confined to archaea.

Table 2.1 - Essential and non-essential amino acids

Non-Polar    Carboxyl        Amine      Aromatic       Hydroxyl   Other
Alanine      Aspartic Acid   Arginine   Phenylalanine  Serine     Asparagine
Glycine      Glutamic Acid   Histidine  Tryptophan     Threonine  Cysteine
Isoleucine                   Lysine     Tyrosine       Tyrosine   Glutamine
Leucine                                                           Selenocysteine
Methionine                                                        Pyrrolysine
Proline
Valine

Table 2.2 - Amino acid categories (based on R-group properties)
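The grouping in Table 2.2 can be written down as a small lookup, which makes it easy to see that one amino acid may sit in more than one category. The following is only a sketch: the dictionary layout and the function name `categories_of` are illustrative assumptions, and lysine is placed in the Amine group to match the discussion of amine amino acids later in this section.

```python
# Categories transcribed from Table 2.2. The data layout and function name
# are illustrative assumptions, not something defined by the book.
CATEGORIES = {
    "Non-polar": ["Alanine", "Glycine", "Isoleucine", "Leucine",
                  "Methionine", "Proline", "Valine"],
    "Carboxyl": ["Aspartic acid", "Glutamic acid"],
    # Lysine is listed here following the "Amine amino acids" discussion.
    "Amine": ["Arginine", "Histidine", "Lysine"],
    "Aromatic": ["Phenylalanine", "Tryptophan", "Tyrosine"],
    "Hydroxyl": ["Serine", "Threonine", "Tyrosine"],
    "Other": ["Asparagine", "Cysteine", "Glutamine",
              "Selenocysteine", "Pyrrolysine"],
}


def categories_of(amino_acid):
    """Return every Table 2.2 category that lists the given amino acid."""
    return [name for name, members in CATEGORIES.items()
            if amino_acid in members]


print(categories_of("Tyrosine"))  # listed as both aromatic and hydroxyl
print(categories_of("Proline"))
```

Counting the distinct names in the lookup gives the 22 amino acids found in proteins, with tyrosine appearing twice.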

Essential and non-essential

Nutritionists divide amino acids into two groups - essential amino acids (must be in the diet because cells can’t synthesize them) and non-essential amino acids (can be made by cells). This classification of amino acids has little to do with the structure of amino acids. Essential amino acids vary considerably from one organism to another and even differ in humans, depending on whether they are adults or children. Table 2.1 shows essential and non-essential amino acids in humans.

Some amino acids that are normally non-essential may need to be obtained from the diet in certain cases. Individuals who do not synthesize sufficient amounts of arginine, cysteine, glutamine, proline, selenocysteine, serine, and tyrosine, due to illness, for example, may need dietary supplements containing these amino acids.

Non-protein amino acids

There are also α-amino acids found in cells that are not incorporated into proteins. Common ones include ornithine and citrulline. Both of these compounds are intermediates in the urea cycle. Ornithine is a metabolic precursor of arginine and citrulline can be produced by the breakdown of arginine. The latter reaction produces nitric oxide, an important signaling molecule. Citrulline is the metabolic byproduct. It is sometimes used as a dietary supplement to reduce muscle fatigue.

R-group chemistry

We separate the amino acids into categories based on the chemistry of their R-groups. If you compare groupings of amino acids in different textbooks, you will see different names for the categories and (sometimes) the same amino acid being categorized differently by different authors. Indeed, we categorize tyrosine both as an aromatic amino acid and as a hydroxyl amino acid. It is useful to classify amino acids based on their R-groups, because it is these side chains that give each amino acid its characteristic properties. Thus, amino acids with (chemically) similar side groups can be expected to function in similar ways, for example, during protein folding.

Figure 2.2 - Amino acid side chain properties Wikipedia

Non-polar amino acids

Alanine (Ala/A) is one of the most abundant amino acids found in proteins, ranking second only to leucine in occurrence. A D-form of the amino acid is also found in bacterial cell walls. Alanine is non-essential, being readily synthesized from pyruvate. It is coded for by GCU, GCC, GCA, and GCG.

Glycine (Gly/G) is the amino acid with the shortest side chain, having an R-group consisting only of a single hydrogen. As a result, glycine is the only amino acid that is not chiral. Its small side chain allows it to readily fit into both hydrophobic and hydrophilic environments.

Glycine is specified in the genetic code by GGU, GGC, GGA, and GGG. It is non-essential to humans.

Isoleucine (Ile/I) is an essential amino acid encoded by AUU, AUC, and AUA. It has a hydrophobic side chain and is also chiral in its side chain.

Leucine (Leu/L) is a branched-chain amino acid that is hydrophobic and essential. Leucine is the only dietary amino acid reported to directly stimulate protein synthesis in muscle, but caution is in order, as 1) there are conflicting studies and 2) leucine toxicity is dangerous, resulting in "the four D's": diarrhea, dermatitis, dementia and death. Leucine is encoded by six codons: UUA, UUG, CUU, CUC, CUA, and CUG.

Figure 2.3 - Non-polar amino acids

Methionine (Met/M) is an essential amino acid that is one of two sulfur-containing amino acids - cysteine is the other. Methionine is non-polar and encoded solely by the AUG codon. It is the “initiator” amino acid in protein synthesis, being the first one incorporated into protein chains. In prokaryotic cells, the first methionine in a protein is formylated.

Proline (Pro/P) is the only amino acid found in proteins with an R-group that joins with its own α-amino group, making a secondary amine and a ring. Proline is a non-essential amino acid and is coded by CCU, CCC, CCA, and CCG. It is the least flexible of the protein amino acids and thus gives conformational rigidity when present in a protein. Proline’s presence in a protein affects its secondary structure. It is a disrupter of α-helices and β-strands. Proline is often hydroxylated in collagen (the reaction requires Vitamin C - ascorbate) and this has the effect of increasing the protein’s conformational stability. Proline hydroxylation of hypoxia-inducible factor (HIF) serves as a sensor of oxygen levels and targets HIF for destruction when oxygen is plentiful.

Valine (Val/V) is an essential, non-polar amino acid synthesized in plants. It is noteworthy in hemoglobin, for when it replaces glutamic acid at position number six, it causes hemoglobin to aggregate abnormally under low oxygen conditions, resulting in sickle cell disease. Valine is coded in the genetic code by GUU, GUC, GUA, and GUG.

Carboxyl amino acids

Aspartic acid (Asp/D) is a non-essential amino acid with a carboxyl group in its R-group. It is readily produced by transamination of oxaloacetate. With a pKa of 3.9, aspartic acid’s side chain is negatively charged at physiological pH. Aspartic acid is specified in the genetic code by the codons GAU and GAC.

Glutamic acid (Glu/E), which is coded by GAA and GAG, is a non-essential amino acid readily made by transamination of α-ketoglutarate. It is a neurotransmitter and has an R-group with a carboxyl group that readily ionizes (pKa = 4.1) at physiological pH.

Figure 2.4 - Carboxyl amino acids

Amine amino acids

Arginine (Arg/R) is an amino acid that is, in some cases, essential, but non-essential in others. Premature infants cannot synthesize arginine. In addition, surgical trauma, sepsis, and burns increase demand for arginine. Most people, however, do not need arginine supplements. Arginine’s side chain contains a complex guanidinium group with a pKa of over 12, making it positively charged at cellular pH. It is coded for by six codons - CGU, CGC, CGA, CGG, AGA, and AGG.

Figure 2.5 - Amine amino acids

Histidine (His/H) is the only one of the proteinaceous amino acids to contain an imidazole functional group. It is an essential amino acid in humans and other mammals. With a side chain pKa of 6, it can easily have its charge changed by a slight change in pH. Protonation of the ring results in two NH structures which can be drawn as two equally important resonant structures.

Lysine (Lys/K) is an essential amino acid encoded by AAA and AAG. It has an R-group that can readily ionize with a charge of +1 at physiological pH and can be post-translationally modified to form acetyllysine, hydroxylysine, and methyllysine. It can also be ubiquitinated, sumoylated, neddylated, biotinylated, carboxylated, and pupylated. O-glycosylation of hydroxylysine is used to flag proteins for export from the cell. Lysine is often added to animal feed because it is a limiting amino acid and is necessary for optimizing growth of pigs and chickens.

Aromatic amino acids

Phenylalanine (Phe/F) is a non-polar, essential amino acid coded by UUU and UUC. It is a metabolic precursor of tyrosine. Inability to metabolize phenylalanine arises from the genetic disorder known as phenylketonuria. Phenylalanine is a component of the aspartame artificial sweetener.

Tryptophan (Trp/W) is an essential amino acid containing an indole functional group. It is a metabolic precursor of serotonin, niacin, and (in plants) the auxin phytohormone. Though reputed to serve as a sleep aid, there are no clear research results indicating this.

Tyrosine (Tyr/Y) is a non-essential amino acid coded by UAC and UAU. It is a target for phosphorylation in proteins by tyrosine protein kinases and plays a role in signaling processes. In dopaminergic cells of the brain, tyrosine hydroxylase converts tyrosine to L-dopa, an immediate precursor of dopamine. Dopamine, in turn, is a precursor of norepinephrine and epinephrine. Tyrosine is also a precursor of thyroid hormones and melanin.

Figure 2.6 - Aromatic amino acids

Hydroxyl amino acids

Serine (Ser/S) is one of three amino acids having an R-group with a hydroxyl in it (threonine and tyrosine are the others). It is coded by UCU, UCC, UCA, UCG, AGU, and AGC. Being able to hydrogen bond with water, it is classified as a polar amino acid. It is not essential for humans. Serine is a precursor of many important cellular compounds, including purines, pyrimidines, sphingolipids, folate, and of the amino acids glycine, cysteine, and tryptophan. The hydroxyl group of serine in proteins is a target for phosphorylation by certain protein kinases. Serine is also a part of the catalytic triad of serine proteases.

Figure 2.7 - Hydroxyl amino acids

Figure 2.8 - Amino acid properties Wikipedia

Threonine (Thr/T) is a polar amino acid that is essential. It is one of three amino acids bearing a hydroxyl group (serine and tyrosine are the others) and, as such, is a target for phosphorylation in proteins. It is also a target for O-glycosylation of proteins. Threonine proteases use the hydroxyl group of the amino acid in their catalysis and it is a precursor in one biosynthetic pathway for making glycine. In some applications, it is used as a pro-drug to increase brain glycine levels. Threonine is encoded in the genetic code by ACU, ACC, ACA, and ACG.

Tyrosine - see the aromatic amino acids.

Other amino acids

Asparagine (Asn/N) is a non-essential amino acid coded by AAU and AAC. Its carboxamide in the R-group gives it polarity. Asparagine is implicated in formation of acrylamide in foods cooked at high temperatures (deep frying) when it reacts with carbonyl groups. Asparagine can be made in the body from aspartate by an amidation reaction with an amine from glutamine. Breakdown of asparagine produces malate, which can be oxidized in the citric acid cycle.

Figure 2.9 - Other amino acids

Cysteine (Cys/C) is the only amino acid with a sulfhydryl group in its side chain. It is non-essential for most humans, but may be essential in infants, the elderly and individuals who suffer from certain metabolic diseases. Cysteine’s sulfhydryl group is readily oxidized to a disulfide when reacted with another one. In addition to being found in proteins, cysteine is also a component of the tripeptide, glutathione. Cysteine is specified by the codons UGU and UGC.
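The codon assignments quoted for the individual amino acids in this section can be collected into a table and used to read an mRNA three bases at a time. This is a minimal sketch, not a full model of translation: the names `CODONS_BY_AMINO_ACID`, `CODON_TO_AA`, and `translate` are hypothetical, the histidine (CAU, CAC) and tryptophan (UGG) codons and the UAA stop codon are taken from the standard genetic code because the text does not list them, and UGA and UAG are treated as plain stop codons, ignoring their recoding for selenocysteine and pyrrolysine.

```python
# Codons as quoted in this section. His, Trp and the UAA stop come from the
# standard genetic code (assumption: the text does not list them).
CODONS_BY_AMINO_ACID = {
    "Ala": ["GCU", "GCC", "GCA", "GCG"],
    "Gly": ["GGU", "GGC", "GGA", "GGG"],
    "Ile": ["AUU", "AUC", "AUA"],
    "Leu": ["UUA", "UUG", "CUU", "CUC", "CUA", "CUG"],
    "Met": ["AUG"],
    "Pro": ["CCU", "CCC", "CCA", "CCG"],
    "Val": ["GUU", "GUC", "GUA", "GUG"],
    "Asp": ["GAU", "GAC"],
    "Glu": ["GAA", "GAG"],
    "Arg": ["CGU", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "His": ["CAU", "CAC"],
    "Lys": ["AAA", "AAG"],
    "Phe": ["UUU", "UUC"],
    "Trp": ["UGG"],
    "Tyr": ["UAU", "UAC"],
    "Ser": ["UCU", "UCC", "UCA", "UCG", "AGU", "AGC"],
    "Thr": ["ACU", "ACC", "ACA", "ACG"],
    "Asn": ["AAU", "AAC"],
    "Cys": ["UGU", "UGC"],
    "Gln": ["CAA", "CAG"],
    # Simplification: UGA and UAG are treated only as stops here.
    "Stop": ["UAA", "UAG", "UGA"],
}

# Invert the table so each codon maps to exactly one meaning.
CODON_TO_AA = {codon: aa
               for aa, codons in CODONS_BY_AMINO_ACID.items()
               for codon in codons}


def translate(mrna):
    """Read codons from the first AUG until a stop codon or the end."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no initiator methionine codon found
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TO_AA[mrna[i:i + 3]]  # KeyError on non-ACGU input
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


print(translate("GGAUGGCUUGCUCGUAAGG"))
```

Because every one of the 64 possible codons appears exactly once, the inverted table has 64 entries. The example sequence contains both UGC (cysteine) and UCG (serine), two codons that are easy to confuse.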

Glutamine (Gln/Q) is an amino acid that is not normally essential in humans, but may be in individuals undergoing intensive athletic training or with gastrointestinal disorders. It has a carboxamide side chain which does not normally ionize under physiological pHs, but which gives polarity to the side chain. Glutamine is coded for by CAA and CAG and is readily made by amidation of glutamate. Glutamine is the most abundant amino acid in circulating blood and is one of only a few amino acids that can cross the blood-brain barrier.

Selenocysteine (Sec/U) is a component of selenoproteins found in all kingdoms of life. It is a component in several enzymes, including glutathione peroxidases and thioredoxin reductases. Selenocysteine is incorporated into proteins in an unusual scheme involving the stop codon UGA. Cells grown in the absence of selenium terminate protein synthesis at UGAs. However, when selenium is present, certain mRNAs which contain a selenocysteine insertion sequence (SECIS) insert selenocysteine when UGA is encountered. The SECIS element has characteristic nucleotide sequences and secondary structure base-pairing patterns. Twenty five human proteins contain selenocysteine.

Pyrrolysine (Pyl/O) is a twenty second amino acid, but is rarely found in proteins. Like selenocysteine, it is not coded for in the genetic code and must be incorporated by unusual means. This occurs at UAG stop codons. Pyrrolysine is found in methanogenic archaean organisms and at least one methane-producing bacterium. Pyrrolysine is a component of methane-producing enzymes.

Ionizing groups

pKa values for amino acid side chains are very dependent upon the chemical environment in which they are present. For example, the R-group carboxyl found in aspartic acid has a pKa value of 3.9 when free in solution, but can be as high as 14 when in certain environments inside of proteins, though that is unusual and extreme. Each amino acid has at least one ionizable amine group (α-amine) and one ionizable carboxyl group (α-carboxyl). When these are bound in a peptide bond, they no longer ionize. Some, but not all amino acids have R-groups that can ionize. The charge of a protein then arises from the charges of the α-amine group, the α-carboxyl group, and the sum of the charges of the ionized R-groups. Titration/ionization of aspartic acid is depicted in Figure 2.10. Ionization (or deionization) within a protein’s structure can have significant effect on the overall conformation of the protein and, since structure is related to function, a major impact on the activity of a protein. Most proteins have relatively narrow ranges of optimal activity that typically correspond to the environments in which they are found (Figure 2.11). It is worth noting that formation of peptide bonds between amino acids removes ionizable hydrogens from both the α-amine and α-carboxyl groups of amino acids. Thus, ionization/deionization in a protein arises only from 1) the amino terminus; 2) carboxyl terminus; 3) R-groups; or 4) other functional groups (such as sulfates or phosphates) added to amino acids post-translationally - see below.

Figure 2.10 - Titration curve for aspartic acid Image by Penelope Irving

Figure 2.11 - Enzyme activity changes as pH changes Image by Aleia Kim

Carnitine

Not all amino acids in a cell are found in proteins. The most common examples include ornithine (arginine metabolism), citrulline (urea cycle), and carnitine (Figure 2.12). When fatty acids destined for oxidation are moved into the mitochondrion for that purpose, they travel across the inner membrane attached to carnitine. Of the two stereoisomeric forms, the L form is the active one. The molecule is synthesized in the liver from lysine and methionine. From exogenous sources, fatty acids must be activated upon entry into the cytoplasm by being joined to coenzyme A. The CoA portion of the molecule is replaced by carnitine in the intermembrane space of the mitochondrion in a reaction catalyzed by carnitine acyltransferase I. The resulting acylcarnitine molecule is transferred across the inner mitochondrial membrane by the carnitine-acylcarnitine translocase and then in the matrix of the mitochondrion, carnitine acyltransferase II replaces the carnitine with coenzyme A (Figure 6.88).

Figure 2.12 - L-Carnitine

Catabolism of amino acids

We categorize amino acids as essential or non-essential based on whether or not an organism can synthesize them. All of the amino acids, however, can be broken down by all organisms. They are, in fact, a source of energy for cells, particularly during times of starvation or for people on diets containing very low amounts of carbohydrate. From a perspective of breakdown (catabolism), amino acids are categorized as glucogenic if they produce intermediates that can be made into glucose or ketogenic if their intermediates are made into acetyl-CoA. Figure 2.13 shows the metabolic fates of catabolism of each of the amino acids. Note that some amino acids are both glucogenic and ketogenic.

Figure 2.13 - Catabolism of amino acids. Some have more than one path. Image by Pehr Jacobson

Post-translational modifications

After a protein is synthesized, amino acid side chains within it can be chemically modified, giving rise to more diversity of structure and function (Figure 2.14). Common alterations include phosphorylation of hydroxyl groups of serine, threonine, or tyrosine. Lysine, proline, and histidine can have hydroxyls added to amines in their R-groups. Other modifications to amino acids in proteins include addition of fatty acids (myristic acid or palmitic acid), isoprenoid groups, acetyl groups, methyl groups, iodine, carboxyl groups, or sulfates. These can have the effects of ionization (addition of phosphates/sulfates), deionization (addition of acetyl group to the R-group amine of lysine), or have no effect on charge at all. In addition, N-linked- and O-linked-glycoproteins have carbohydrates covalently attached to side chains of asparagine and threonine or serine, respectively.

Figure 2.14 - Post-translationally modified amino acids. Modifications shown in green. Image by Penelope Irving

Some amino acids are precursors of important compounds in the body. Examples include epinephrine, thyroid hormones, L-dopa, and dopamine (all from tyrosine), serotonin (from tryptophan), and histamine (from histidine).
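The dependence of side chain charge on pH and pKa, discussed under ionizing groups, can be estimated with the Henderson-Hasselbalch relationship. The sketch below is an illustration under stated assumptions, not a method given in the text: the function names are hypothetical, physiological pH is taken to be 7.4, each group is treated as an isolated ionizable group, and the pKa values (3.9 for aspartic acid, 6 for histidine, and the extreme value of 14 for a buried aspartic acid) are the ones quoted in this section.

```python
def fraction_deprotonated(ph, pka):
    """Henderson-Hasselbalch: fraction of a group that has lost its proton."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def acidic_group_charge(ph, pka):
    """Average charge of a carboxyl-type group: 0 protonated, -1 deprotonated."""
    return -fraction_deprotonated(ph, pka)


def basic_group_charge(ph, pka):
    """Average charge of an amine-type group: +1 protonated, 0 deprotonated."""
    return 1.0 - fraction_deprotonated(ph, pka)


PHYSIOLOGICAL_PH = 7.4  # assumed value; the text only says "physiological pH"

# Aspartic acid side chain free in solution (pKa 3.9): almost fully negative.
print(acidic_group_charge(PHYSIOLOGICAL_PH, 3.9))

# Histidine side chain (pKa 6): half charged at pH 6, mostly neutral at 7.4.
print(basic_group_charge(6.0, 6.0))
print(basic_group_charge(PHYSIOLOGICAL_PH, 6.0))

# An aspartic acid carboxyl whose pKa is shifted to 14 inside a protein
# stays protonated, so it carries essentially no charge.
print(acidic_group_charge(PHYSIOLOGICAL_PH, 14.0))
```

The histidine case shows why a slight change in pH near its pKa changes its charge so readily, while the shifted aspartic acid case shows how the chemical environment can switch a charge off entirely.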

Figure 2.15 - Phosphorylated amino acids

Building polypeptides

Although amino acids serve other functions in cells, their most important role is as constituents of proteins. Proteins, as we noted earlier, are polymers of amino acids.

Amino acids are linked to each other by peptide bonds, in which the carboxyl group of one amino acid is joined to the amino group of the next, with the loss of a molecule of water. Additional amino acids are added in the same way, by formation of peptide bonds between the free carboxyl on the end of the growing chain and the amino group of the next amino acid in the sequence.

Figure 2.16 - Formation of a peptide bond

A chain made up of just a few amino acids linked together is called an oligopeptide (oligo=few) while a typical protein, which is made up of many amino acids, is called a polypeptide (poly=many). The end of the peptide that has a free amino group is called the N-terminus (for NH2), while the end with the free carboxyl is termed the C-terminus (for carboxyl).
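Because each peptide bond forms with the loss of one water molecule, the mass of a peptide is less than the sum of its free amino acids. The short sketch below works through that arithmetic. The function name `peptide_mass` is hypothetical, and the approximate molar masses (glycine 75.07, alanine 89.09, serine 105.09, water 18.015 g/mol) are standard values that do not appear in the text.

```python
WATER = 18.015  # g/mol, released for each peptide bond formed

# Approximate molar masses of the free amino acids in g/mol
# (standard values, not taken from this chapter).
FREE_AMINO_ACID_MASS = {"Gly": 75.07, "Ala": 89.09, "Ser": 105.09}


def peptide_mass(sequence):
    """Mass of a linear peptide, listed from N-terminus to C-terminus.

    A chain of n amino acids contains n - 1 peptide bonds, so n - 1
    water molecules are subtracted from the summed free masses.
    """
    if not sequence:
        return 0.0
    total = sum(FREE_AMINO_ACID_MASS[residue] for residue in sequence)
    return total - WATER * (len(sequence) - 1)


# Dipeptide with glycine at the N-terminus and alanine at the C-terminus:
# 75.07 + 89.09 - 18.015
print(peptide_mass(["Gly", "Ala"]))

# Tripeptide: three amino acids, two peptide bonds, two waters lost.
print(peptide_mass(["Gly", "Ala", "Ser"]))
```

A single free amino acid has no peptide bonds, so its mass is returned unchanged.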

As we’ve noted before, function is dependent on structure, and the string of amino acids must fold into a specific 3-D shape, or confor- mation, in order to make a functional protein. The folding of polypeptides into their func- tional forms is the topic of the next section.

Graphic images in this book were products of the work of several talented students: Aleia Kim, Pehr Jacobson, Penelope Irving, and Martha Baker.

The Amino Alphabet Song
To the tune of “Twinkle, Twinkle Little Star”

Lysine, arginine and his
Basic ones you should not miss
Ala, leu, val, ile, and met
Fill the aliphatic set
Proline bends and cys has ‘s’
Glycine’s ‘R’ is the smallest
Then there’s trp and tyr and phe
Structured aromatically

Asp and glu’s side chains of R
Say to protons “au revoir”
Glutamine, asparagine
Bear carboxamide amines
Threonine and tiny ser
Have hydroxyl groups to share
These twen-TY amino A’s

Recorded by David Simmons
Lyrics by Kevin Ahern

You’re Cysteine
To the tune of “You’re Sixteen”

You’re in every protein
That I’ve ever seen
There’s no need to debate
You’re cysteine, a building block, and you’re great

You give the hair waves
It’s something we crave
Makes the food on my plate
You’re cysteine, a building block, and you’re great

Bridge
That sulfhydryl in your chain
Can oxidize, but I don’t complain
It gives support to all peptides
So proteins need to have disulfides

You’re an acid it seems
And have an amine
Please don’t ever mutate
You’re cysteine, a building block, and you’re great

Bridge
A U-G-U or U-G-C
In secret code at the cell's decree
You’re in my skin and in my bone
And even in my glutathione

You’re in every protein
That I’ve ever seen
There’s no need to debate
You’re cysteine, a building block, and you’re great

Lyrics by Kevin Ahern
No Recording Yet For This Song