Protein folding hierarchy

1. Primary structure - sequence.

2. Secondary structure, e.g., α helix, β sheet, helix.

3. Tertiary structure - the folding of secondary structure motifs into a functional protein, e.g., the folded myoglobin protein, which consists almost entirely of α helices.

4. Quaternary structure - the assembly of polypeptides into a multi-subunit protein, e.g., the assembly of four polypeptides into a functional hemoglobin molecule or the assembly of six Rho polypeptides into a functional Rho transcription termination factor.

Globular protein structures

Most proteins are not built from repeats of helices alone (like collagen or keratins) or of β sheets alone (like the silk protein fibroin). Instead, many globular proteins combine these structural motifs. The folding of these motifs together into a three dimensional structure is the tertiary level of protein structure.

1. Antiparallel (Greek key) β sheet protein

This is jack bean concanavalin A. The yellow spheres represent metal ions.

2. Parallel β sheet proteins

Examples shown: phosphoglycerate mutase, hexokinase domain 1, flavodoxin.

3. α/β barrel

This is triose phosphate isomerase. Notice the α helices surrounding the β sheets.

(Figure: side view and top view.)

4. Antiparallel α helices

5. Metal or disulfide rich proteins

Figure labels: myohemerythrin; influenza virus hemagglutinin HA2.

Globular protein structures - protein domains

(Figure: Bacillus stearothermophilus glyceraldehyde-3-phosphate dehydrogenase, showing the glyceraldehyde-3-phosphate binding domain and the NAD⁺ binding domain.)

Glyceraldehyde-3-phosphate + Pi + NAD⁺ → NADH + 1,3-bisphosphoglycerate + H⁺

•Often, proteins consist of functional domains. These domains usually serve different functions that are related to the overall function of the protein.

•By definition, protein domains can function without being connected to the complete protein.

Consider bacterial and yeast transcription activators. Each domain can carry out its function in the absence of the other domain:

Transcription activation domain - binds RNA polymerase and serves to increase transcription initiation by bringing the polymerase close to the promoter.

DNA binding domain - binds specific DNA sequences that are found close to promoter sequences.

Why is this beneficial? Proteins are built of functional modules. At some point the DNA binding domain became fused to the transcription activation domain. This increases the concentration of the activation domain near the promoter, thus promoting binding of RNA polymerase to the promoter and an increase in gene expression.

Globular protein structures - quaternary structure of hemoglobin

Polypeptides can bind and form complexes. These complexes primarily are not bound covalently but by definition are stable. Each polypeptide is called a subunit.

The quaternary structure of the 4 subunits of hemoglobin is shown on the right. Color key: α1 - yellow; α2 - green; β1 - cyan; β2 - blue; heme groups - red.

Symmetry of oligomeric proteins

When proteins made of equivalent subunits form complexes, they can be characterized by particular rotational (cyclic) symmetry. The dotted line represents the symmetry axis.

(Figure: examples of C2, C3, and C5 symmetry.)

A protein is made of 6 identical subunits. If the six subunits associate in a particular way the protein will have C3 symmetry but not a C6 symmetry. How would the subunits be arranged in order to produce C3 symmetry but not C6 symmetry?
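One way to explore this question is to test candidate arrangements numerically. The sketch below is a deliberately simplified 2-D model: each subunit is reduced to a single point (its orientation is ignored), and an arrangement is said to have Cn symmetry if rotating it by 360/n degrees maps the set of points onto itself. The function names and the "trimer of dimers" geometry are hypothetical illustrations, and this is only one possible arrangement that satisfies the condition.

```python
import math

def rotate(point, degrees):
    """Rotate a 2-D point about the origin by the given angle."""
    a = math.radians(degrees)
    x, y = point
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))

def has_cyclic_symmetry(points, n, tol=1e-6):
    """True if a rotation by 360/n degrees maps the point set onto itself."""
    for p in points:
        r = rotate(p, 360.0 / n)
        if not any(math.hypot(r[0] - q[0], r[1] - q[1]) < tol for q in points):
            return False
    return True

def on_circle(angles_deg, radius=1.0):
    """Place one point (one subunit) at each angle on a circle."""
    return [(radius * math.cos(math.radians(a)),
             radius * math.sin(math.radians(a))) for a in angles_deg]

# Six subunits evenly spaced in a ring (every 60 degrees).
hexagonal_ring = on_circle([0, 60, 120, 180, 240, 300])

# Hypothetical "trimer of dimers": three pairs, pair centers 120 degrees apart,
# the two members of each pair 30 degrees apart.
trimer_of_dimers = on_circle([c + s * 15 for c in (0, 120, 240) for s in (-1, 1)])

for name, arrangement in (("ring", hexagonal_ring), ("trimer of dimers", trimer_of_dimers)):
    print(name, {n: has_cyclic_symmetry(arrangement, n) for n in (2, 3, 6)})
```

In this toy model the evenly spaced ring passes the C2, C3, and C6 tests, while the paired arrangement passes only the C3 test. Real subunits are asymmetric objects, so a full treatment would also have to track their orientations.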

Dihedral symmetry

(Figure: examples of D2, D3, and D4 symmetry.)

Other symmetries: tetrahedral (T), octahedral (O), icosahedral (I).

Forces that stabilize protein structure

or, why do proteins stay folded?

The hydrophobic effect. Non-polar side chains tend to minimize their contact with water. This results in increased solvent (here water) entropy (favorable ∆S of solvent) because fewer water molecules are required to solvate the non-polar side chains.

Side chain hydropathy is a way of predicting which residues would be “buried” inside the protein, away from water. A positive hydropathy value indicates increased non-polarity and an increased likelihood that the amino acid would be found inside the hydrophobic core of the protein. The most hydropathic residues are isoleucine, valine, and leucine. The least hydropathic are arginine, lysine, asparagine, and aspartate.

(Figure: hydropathic index plotted against residue number.)

For bovine chymotrypsin, the sum of 9 hydropathy values of consecutive residues is plotted against the residue number. The upper bars show regions found in the interior of the protein, and the lower bars indicate residues found on the exterior.
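The windowed sum behind such a plot can be sketched in a few lines of Python. The scale below uses the commonly tabulated Kyte-Doolittle hydropathy values, which is an assumption because the slides do not name the scale; the function name and the example sequence are hypothetical and are not taken from chymotrypsin.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not named in the slides).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(sequence, window=9):
    """Sum hydropathy over each run of `window` consecutive residues.

    Returns (residue_number, windowed_sum) pairs, where residue_number is the
    1-based position of the residue at the center of the window.
    """
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    half = window // 2
    return [(start + half + 1, sum(values[start:start + window]))
            for start in range(len(values) - window + 1)]

# Hypothetical sequence: a non-polar stretch followed by a charged stretch.
example = "MKTLLVVIAGLLAVGSRDDKEENRKQ"
for residue_number, score in hydropathy_profile(example):
    label = "likely interior" if score > 0 else "likely exterior"
    print(f"{residue_number:3d} {score:7.1f}  {label}")
```

Classifying a window by the sign of its sum is a simplification chosen for this sketch; where to draw the line between "interior" and "exterior" is a judgment call.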

•Non-covalent associations, or van der Waals interactions, are individually weak but together contribute much stability to the structure of the protein. Many functional groups have permanent dipole moments (e.g., the carbonyl and amide groups in the polypeptide backbone). Packing forces, or London dispersion forces, act between atoms situated close to each other and further stabilize protein structure.

•Electrostatic interactions, or salt bridges, between residues of opposite charge, e.g., lysine and aspartate, draw these residues together. This is not a significant contribution because the free energy of solvation of free ions is about equal to the free energy of solvation of ion pairs. Each ion stays solvated, so salt bridges do not contribute much stability to the folded protein.

•Hydrogen bonds contribute much stability to α helices and β sheets.

A semi-quantitative treatment of the protein folding problem is given in the textbook.

ΔGtotal = ΔHchain + ΔHsolvent − TΔSchain − TΔSsolvent

∆Schain is not favorable because folding decreases the freedom of the chain to move in solution.

∆Ssolvent is very favorable because of the hydrophobic effect. The water molecules that were used to solvate the hydrophobic residues are now more free to move about in solution. This term is mostly responsible for driving the folding reaction.

∆Hchain is not favorable. For non-polar side chains the interactions in the folded state are weaker than the interactions with water because water induces dipoles within the non-polar side chains and interacts with these induced dipoles. So for non-polar side chains the enthalpy of folding is not favorable. For polar side chains, a similar situation applies, because the interaction with water is stronger than the interactions within the protein.

∆Hsolvent is favorable because folding allows water molecules to interact with each other rather than interact with the polar and non-polar side chains. The interaction of water with the side chains is weaker, so this term is favorable.
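To see how the four terms combine, the equation above can be evaluated directly. In the sketch below only the signs follow the text (∆Hchain unfavorable, ∆Hsolvent favorable, ∆Schain unfavorable, ∆Ssolvent favorable). The magnitudes are hypothetical numbers chosen purely for illustration and are not measured values for any protein; the function name is also an assumption.

```python
def folding_free_energy(dH_chain, dH_solvent, dS_chain, dS_solvent, temperature):
    """ΔGtotal = ΔHchain + ΔHsolvent − TΔSchain − TΔSsolvent.

    Enthalpies in kJ/mol, entropies in kJ/(mol·K), temperature in K.
    """
    return (dH_chain + dH_solvent
            - temperature * dS_chain
            - temperature * dS_solvent)

# Hypothetical magnitudes; only the signs come from the discussion above.
terms = {
    "dH_chain": 200.0,     # unfavorable (positive)
    "dH_solvent": -100.0,  # favorable (negative)
    "dS_chain": -0.6,      # unfavorable (chain loses freedom)
    "dS_solvent": 1.0,     # favorable (hydrophobic effect)
}
T = 298.0  # K

dG = folding_free_energy(temperature=T, **terms)
print(f"-T*dS_chain   = {-T * terms['dS_chain']:8.1f} kJ/mol")
print(f"-T*dS_solvent = {-T * terms['dS_solvent']:8.1f} kJ/mol")
print(f"dG_total      = {dG:8.1f} kJ/mol")
```

With these made-up inputs the net ∆G is small and negative, the result of large opposing contributions that nearly cancel. Changing any one input shows how easily the balance can tip.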

X-ray crystallography - crystals

A central goal of biochemistry is to understand the relationship between the structure of a macromolecule and its function. To that end, it is imperative to understand the three dimensional conformation of the molecule. This is often done by X-ray crystallography.

The crystals of proteins and nucleic acids contain 40-60% water, in contrast to the essentially water-free crystals of most small inorganic and organic molecules. The water molecules are bound to the proteins and nucleic acids and stabilize the structures. Therefore, protein crystals are typically softer and have a jelly-like consistency. Proteins sometimes contain light-absorbing groups, which render the crystals quite colorful.

Figure labels: marine worm myohemerythrin; bacteriochlorophyll A protein.

X-ray crystallography - diffraction

•X-ray beams have a wavelength of ~1.5 Å, so they are well-suited for determining the three dimensional arrangement of atoms within a macromolecule (the uncertainty in locating an object using radiation is approximately equal to the wavelength of the radiation).

•A crystal of the macromolecule is bombarded with X-ray beams and the diffraction pattern is recorded on photographic film or by a radiation counter.

•X-ray “particles” primarily interact with electrons, not nuclei. Therefore, the intensity of each diffraction maximum (the darkness of the spot) is determined by the electron density of the crystal.

•On the right the film shows the diffraction of sperm whale myoglobin.

X-ray crystallography - electron density map calculation

Using the diffraction data an electron density map is calculated. The electron density map shows the three dimensional arrangements of electrons within the crystal.

One can draw a number of electron density maps on transparent sheets, each map representing a “slice” through the crystal. These maps can then be superimposed to give a three dimensional representation of the electron density.

A more modern approach is to use computer graphics programs that calculate three dimensional electron density maps.

What are some of the caveats inherent in the X-ray crystallography approach?

X-ray crystallography - resolution

•Because protein crystals contain many water molecules, the crystal typically is not as rigid as a NaCl crystal. There is a small amount of disorder within the crystal; consequently, no information is available for some of the small functional groups. The crystal is said to have a resolution limit of that size. As the distance that can be distinguished by the electron density map is decreased, the resolution is increased.

•The resolution also represents the quality of the crystal. A high resolution indicates a high-quality crystal.

•Most side chains cannot be ascertained by X-ray crystallography because the resolution of the crystals is too low. Nevertheless, the side chains can be “filled in” if the sequence of the polypeptide is known (which nowadays it usually is) and the backbone has a reasonable set of constraints.

•Example: a section through the electron density of diketopiperazine calculated at the indicated resolution levels (from Voet and Voet, Biochemistry, 2nd edition, page 164).
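The effect of resolution on an electron density map can be illustrated with a toy one-dimensional calculation. The sketch below assumes that the map is obtained by summing Fourier terms derived from the diffraction data, and that both the amplitude and the phase of every term are known, which a real diffraction experiment does not provide directly. The "unit cell", the two Gaussian "atoms", and all function names are hypothetical.

```python
import cmath
import math

def toy_density(n_points=64, atoms=((16, 8.0), (20, 6.0)), sigma=0.02):
    """1-D toy unit cell: Gaussian 'atoms' given as (grid index, peak height)."""
    density = []
    for i in range(n_points):
        x = i / n_points
        density.append(sum(z * math.exp(-((x - j / n_points) ** 2) / (2 * sigma ** 2))
                           for j, z in atoms))
    return density

def structure_factors(density, h_max):
    """Fourier coefficients F(h) of the density for h = -h_max .. h_max."""
    n = len(density)
    return {h: sum(rho * cmath.exp(2j * math.pi * h * i / n)
                   for i, rho in enumerate(density)) / n
            for h in range(-h_max, h_max + 1)}

def fourier_synthesis(factors, n_points):
    """Rebuild a density map from a (possibly truncated) set of F(h)."""
    return [sum(f * cmath.exp(-2j * math.pi * h * i / n_points)
                for h, f in factors.items()).real
            for i in range(n_points)]

def count_peaks(values, fraction=0.5):
    """Count local maxima higher than `fraction` of the tallest point."""
    n = len(values)
    cutoff = fraction * max(values)
    return sum(1 for i in range(n)
               if values[i] > cutoff
               and values[i] > values[i - 1]
               and values[i] > values[(i + 1) % n])

cell = toy_density()
low_res = fourier_synthesis(structure_factors(cell, 4), len(cell))    # few terms
high_res = fourier_synthesis(structure_factors(cell, 30), len(cell))  # many terms
print("peaks at low resolution: ", count_peaks(low_res))
print("peaks at high resolution:", count_peaks(high_res))
```

Keeping only the low-order terms merges the two neighboring atoms into a single broad peak, while including the higher-order terms separates them. This is the one-dimensional analogue of the smallest distance that a map can distinguish.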

Nuclear magnetic resonance

•NMR can be used to determine the structure of polypeptides (currently up to 100-120 aa) and small RNAs. This technique is useful because it does not rely on crystal formation and provides the structural information in solution.

•We will not discuss the principles of NMR in detail. Briefly, the technique relies on the alignment of protons within a powerful magnet and on their relaxation times. Neighboring groups affect the relaxation times, and this effect can be quantified such that a 3-D map can be drawn.

•Several structures that fit the data can be drawn and are represented below.

This is a trace of the hemoglobin backbone from an NMR data set. Why are there several solutions to the data set?
