JBB2026 Fall 2020 Simon Sharpe
Protein Structure
• peptide conformations and residue preferences
• elements of secondary structure
• supersecondary structure and motifs
• packing of helices and sheets
• chain topologies
• internal packing
• protein interfaces
• membrane proteins
• multimeric proteins
• domain motions
[Figure: folded and disordered chains; sequence Tyr Thr Gly Cys Ile Ile Ala Gly]
The structure (conformer) is defined by the dihedral angles: main chain (φ, ψ) and side chains (χ1, χ2, …)
φ = 180°; ψ = 180°
φ = -60°; ψ = -45°
The minimum energy conformer of the polymer is determined largely by the non-local interactions between the side chains.
[Figure credit: D. Goodsell]
One way of categorizing the 20 amino acids: each amino acid has particular characteristics
Amino acid hydrophobicity
Protein conformations
The conformation is the arrangement of the atoms in 3D space. The most stable conformation is at a potential energy minimum
Proper treatment is quantum mechanical - but this is intractable with proteins (too many atoms). We use Newtonian mechanics and describe the system as a set of potential energy terms, each with a particular form.
The overall potential energy can be broken down into a set of energy functions:
LOCAL
- bond length (1,2) (strong)
- bond angle (1,3) (strong)
- dihedral angle (1,4) (medium)
NON-LOCAL
- solvation / hydrophobic effect
- van der Waals (packing, steric clashes)
- electrostatics (incl. hydrogen bonds)
conformational entropy
These are additive: calculate the overall potential energy as the sum of these individual functions.
Non-local molecular interactions
Type | Name | Attractive or repulsive? | Distance dependence | Notes
Short range | Pauli exclusion | repulsive | 1/r^12 | basis for van der Waals radii of atoms *
Electrostatic (charge-charge) | Coulomb's law | either; depends on q1q2 | 1/r |
Dipole-dipole | Keesom interactions | either; depends on the directions of the dipole moments | 1/r^3 | involves polar molecules/groups
Dipole-induced dipole | Debye interactions | attractive | 1/r^4 | polarization: change in a dipole due to an external electric field
Charge-dipole | | attractive | |
Fluctuating dipoles | London dispersive interactions | attractive | 1/r^6 | resonant induced dipoles *
Hydrogen bond | | attractive | | often treated as electrostatic
Cation-pi | | attractive | | similar to charge-induced dipole
Hydrophobic effect | | | |
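The two starred rows (1/r^12 repulsion and 1/r^6 dispersion) are commonly combined into a single Lennard-Jones "6-12" term. A minimal sketch of that functional form follows; the well depth `epsilon` and the optimal separation `r_min` are illustrative values chosen here, not parameters from the slides or from any particular force field.

```python
def lennard_jones(r, epsilon=0.2, r_min=3.8):
    """Lennard-Jones 6-12 energy in kcal/mol at separation r (in Å).

    epsilon: well depth (kcal/mol); r_min: separation at the energy minimum (Å).
    Both defaults are hypothetical, for illustration only.
    """
    ratio = r_min / r
    # The (ratio**12) term is the short-range repulsion,
    # the (ratio**6) term is the attractive dispersion.
    return epsilon * (ratio**12 - 2.0 * ratio**6)


for r in (3.0, 3.8, 6.0):
    print(f"r = {r:.1f} Å  E = {lennard_jones(r):+.3f} kcal/mol")
```

The energy is strongly positive when the atoms overlap, reaches -epsilon at r_min, and decays towards zero at long range.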
Source: Loren Williams website, http://ww2.chemistry.gatech.edu/~lw26/structure/molecular_interactions/mol_int.html
* Lennard-Jones "6-12" potential
The peptide bond
Delocalized electrons over peptide bond: (1) Increased polarity - gives rise to dipole moment represented by arrow (2) Partial double bond - O-C-N bonds coplanar and rotation is limited
Conformational energy of butane as a function of the central torsion angle
Boltzmann distribution: populate according to energies
[Figure: potential energy (kcal/mol, axis 35 to 40) vs. torsion angle (0° to 360°)]
Conformer: e, g+, t, g-, e
Population: 0%, 15%, 70%, 15%, 0%
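The populations quoted for butane can be roughly reproduced by Boltzmann-weighting the three staggered minima. This is a sketch under stated assumptions: the gauche minima are taken to lie 0.9 kcal/mol above trans (a commonly quoted value, not read off the slide), the temperature is 298 K, and only the minima are counted, ignoring the widths of the wells.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def boltzmann_populations(energies, temperature=298.0):
    """Return fractional populations for a dict of {state: energy in kcal/mol}."""
    rt = R_KCAL * temperature
    weights = {state: math.exp(-e / rt) for state, e in energies.items()}
    total = sum(weights.values())
    return {state: w / total for state, w in weights.items()}


# Assumed relative energies of the staggered conformers (kcal/mol).
butane = {"g+": 0.9, "t": 0.0, "g-": 0.9}
populations = boltzmann_populations(butane)
for state, p in populations.items():
    print(f"{state}: {100 * p:.0f}%")
```

The eclipsed conformers sit several kcal/mol higher and so contribute essentially nothing, consistent with the 0% entries.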
[Figure: peptide backbone with the φ, ψ and ω dihedral angles labelled]
Note: a C-N single bond is 1.45 Å long. The peptide C-N bond is shorter (~1.33 Å) and has ~40% double bond character, so the dihedral is constrained: ω is held at ~180°.
 | Bond lengths | Bond angles | Dihedrals
sp2-hybridized atoms | shorter | 120° (flat) | more restrained (trans)
sp3-hybridized atoms | longer | 109° (tetrahedral, often chiral) | less restrained (gauche-, gauche+, trans)
Geometry of the peptide bond
Potential energy curve for the peptide omega dihedral angle:
Barriers for dihedral angle rotation can be attributed to:
- the exchange interaction of electrons in adjacent bonds
- repulsive interactions between overlapping bond orbitals
- steric clashes between atoms (the clash between groups 1 and 4 across the 1,4 bond disfavors cis)
Note: only two minima here
Occurrence of omega in cis (Stewart et al. 1990):
X-X: 0.36% (116/32,539)
X-Pro: 6.5%
Ser-Pro: 11%
Tyr-Pro: 25%
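If these database frequencies are read as approximate equilibrium populations (an assumption: crystal structure statistics are not strictly Boltzmann populations), they can be converted into an apparent cis-trans free energy difference via ΔG = -RT ln(f_cis / f_trans). The sketch below assumes 298 K; the function name is illustrative.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def apparent_delta_g(f_cis, temperature=298.0):
    """Apparent G(cis) - G(trans) in kcal/mol from the observed cis fraction."""
    f_trans = 1.0 - f_cis
    return -R_KCAL * temperature * math.log(f_cis / f_trans)


observed = {
    "X-X": 116 / 32539,
    "X-Pro": 0.065,
    "Ser-Pro": 0.11,
    "Tyr-Pro": 0.25,
}
delta_g = {pair: apparent_delta_g(f) for pair, f in observed.items()}
for pair, dg in delta_g.items():
    print(f"{pair}: {dg:.2f} kcal/mol")
```

On this reading the non-proline peptide bond disfavors cis by a little over 3 kcal/mol, while X-Pro bonds come out between roughly 0.6 and 1.6 kcal/mol.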
Xaa-Pro is an exception (peptide bond preceding a proline)
- lower barrier to interconversion
- only ~2 kcal/mol energy difference between cis and trans omega bond
- but slow interconversion (usually needs to be catalyzed to equilibrate)
- ~6% of Xaa-Pro have cis omega angles, otherwise very rare (<0.5%)
Ramachandran plot of φ, ψ angles in proteins
Main chain
Calculated energy surface (theoretical): the "classic" Ramachandran plot, based on hard sphere potentials (sum of vdW radii; a simple form of the L-J potential)
Observed distribution of (φ, ψ) in protein structures from the PDB (experimental)
Conformationally unusual residues
[Figure: cysteine (Cys or C) and Cys-Cys]
Ramachandran plot for Xaa-Pro
Pre-Pro (any residue preceding Pro): φ is relatively normal; ψ is restricted to 90° to 180°
Proline: φ "fixed" at -60°; ψ = -55° or 145°
Residue-specific Ramachandran plots
Amino acid side chain torsion angles (χn)
Torsion angle definition, shown here for arginine
The different conformations of the side chain as a function of χ1 are referred to as gauche(+), trans and gauche(-). These are indicated in the diagrams below, in which the amino acid is viewed along the Cβ-Cα bond.
χ1 = -60° (most common), χ1 = 180°, χ1 = +60° (least common)
β-branched residues, e.g. Val (Cγ1, Cγ2): χ1 = -60°, χ1 = 180°, χ1 = +60°
Potential surfaces for side chain dihedral angles
This defines the major rotamers for each amino acid.
E.g. χ1-χ2 plot for phenylalanine
[Figure: χ1-χ2 plot, χ1 axis labelled gauche-, gauche+, trans and χ2 axis labelled gauche-, gauche+; side chain drawing showing Cα, Cδ1, Cδ2, χ1 and χ2]
Why can't Phe χ2 be trans? Cγ is sp2 hybridized
χ1-χ2 plots
Three rules for secondary structure
1) Local "bonded" potentials must be minimized
- bond lengths (1,2)
- bond angles (1,3)
- dihedrals (1,4) (Ramachandran)
- regular: all (φ, ψ) the same
2) Satisfy main chain hydrogen bonding
- Typically, >90% of the potential backbone hydrogen bond donors and acceptors are involved in hydrogen bonds
3) No unfavourable steric interactions
- Ramachandran
Types of Secondary Structure:
Helices: α, 3₁₀, π, poly-proline II
β-sheets: parallel, antiparallel
Beta Bulges
Turns / hairpins
Specific residue preferences φ,ψ,χ for particular amino acids i.e. how side chains affect the above
‘allowed’ regions of the Ramachandran plot
Name | Frequency* | φ (°) | ψ (°) | n | d (Å) | H-bonding
3₁₀ helix | ~4% | -74 | -4 | +3.0 | 2 | i,i+3
α helix | ~35% | -57 | -47 | +3.6 | 1.5 | i,i+4
αL helix | - | +57 | +47 | -3.6 | 1.5 | i,i+4
π helix | - | -57 | -70 | +4.3 | 1.1 | i,i+5
Collagen (PP type II) | Fibres | -78 | +149 | -3.3 | 2.9 |
planar β-sheet (para) | - | -115 | +115 | 2 | 3.2 |
twisted β-sheet, parallel | ~25% | -120 | +135 | -2.3 | 3.3 | interstrand
twisted β-sheet, antiparallel | ~8 | -139 | +135 | -2.3 | 3.3 | interstrand
B-DNA | - | - | | 10 | 3.4 | interchain
*Crude estimate in globular proteins
Helical Structures
Translation per residue (in Å) = rise per residue (d)
Helical wheel representation: every 3rd or 4th residue clusters on the same face (1, 5, 8, etc.). This allows definition of an amphipathic helix, i.e. one face polar, the other non-polar.
Alpha helix Hydrogen bond between carbonyl of residue i with amide-H of residue i + 4
[Figure: residues i to i+7 along the chain, N to C, with the i,i+3, i,i+4 and i,i+5 hydrogen bonds marked]
n: residues per turn; d: rise per residue
(n, d): 3₁₀ (3.0, 2.0 Å); αR (3.6, 1.5 Å); π (4.3, 1.1 Å)
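From n and d, the pitch (rise per full turn) is n × d and the twist per residue is 360°/n. The sketch below works these out for the three helices, and also places residues 1, 5 and 8 of an α helix on a helical wheel to show why they cluster on one face. The dictionary and function names are illustrative only.

```python
# (n, d): residues per turn and rise per residue (Å), values from the slide.
HELICES = {"3_10": (3.0, 2.0), "alpha": (3.6, 1.5), "pi": (4.3, 1.1)}


def pitch(n, d):
    """Rise along the helix axis per full turn, in Å."""
    return n * d


def twist_per_residue(n):
    """Rotation about the helix axis per residue, in degrees."""
    return 360.0 / n


def wheel_angle(residue, n=3.6):
    """Angular position (degrees, 0 to 360) of a residue on a helical wheel,
    with residue 1 placed at 0 degrees."""
    return ((residue - 1) * twist_per_residue(n)) % 360.0


for name, (n, d) in HELICES.items():
    print(f"{name}: pitch = {pitch(n, d):.2f} Å, twist = {twist_per_residue(n):.1f}°")

for residue in (1, 5, 8):
    print(f"residue {residue}: {wheel_angle(residue):.0f}°")
```

For the α helix this gives a pitch of 5.4 Å, and residues 1, 5 and 8 fall at 0°, 40° and 340°, i.e. within 40° of one another.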
Amphipathic helices
100°/residue
Long helices are rarely straight:
- smooth bends (e.g. tropomyosin, coiled-coil dimers)
- kinks (waters often bridge the i,i+4 H-bond; membrane proteins)
- often amphipathic: one face interacting with bulk solvent, one with the protein core
- lots of strain due to longer-range contacts within the protein (non-local effects)
Transmembrane helices
60 Å radius of curvature: bending is not energetically expensive (< 2 kcal/mol for a 5-turn helix)
Saposin A: kink in alpha3
closed form open form (ligand bound) (apo)
Ahn et al., PNAS (2003); Ahn et al., Protein Science (2006) [figure label: Y54]
Polyproline II helix
The PPII helix is defined by (φ,ψ) backbone dihedral angles of roughly (-75°, 150°) and TRANS isomers of the peptide bonds.
- Left-handed helix
- 3 residues/turn, 3.1 Å rise/residue
- No internal hydrogen bonds: no H-bond donor in proline
- Important in binding of peptides to SH3 domains: PPII structures are binding targets for SH3 signalling domains
[Figure: top view of a twenty-residue poly-Pro II helix, showing the three-fold symmetry]
[Figure: poly-Pro II helix, showing its openness and lack of internal hydrogen bonding]
Polyproline I helix
The PPI helix is defined by (φ,ψ) backbone dihedral angles of roughly (-75°, 160°) and CIS isomers of the peptide bonds.
Rarely found because the cis isomer is higher in energy than the trans.
Right-handed helix, 3.3 residues/turn, 1.9 Å rise/residue
Beta-sheets
Parallel
• Generally buried
• Less twisted
Antiparallel
• Generally, one side exposed, other side buried
• Can withstand greater distortions (twisting and beta-bulges)
• MORE STABLE
Pitch = 7.0 Å
[Figure: antiparallel beta sheet and parallel beta sheet, with side chains R1 to R4 alternating]
Twist of a mixed β-sheet
Thioredoxin (1TRX)
The β-bulge
A motif of three residues within a beta-sheet in which the main chains of two consecutive residues are H-bonded to that of the third, and in which the dihedral angles are as follows:
- Disrupts normal alternation of side chain direction
- Accentuates the twist of the sheet
Chain Topologies (tertiary structure)
Huge variety: some are regular (e.g. TIM barrel; β-barrel), some are not.
Proteins often assembled from "domains"
"Never" see knots
Minimum size for stability ~60 amino acids
If small, often stabilized by disulfides, co-factors, etc., e.g. zinc fingers
If extracellular, often have:
- S-S bonds
- glycosylation
Salt bridges are not very common
Tertiary structure or chain topologies
Huge variety. Typically not regular.
A domain is defined as a compact unit of protein structure that is usually capable of folding stably as an independent entity in solution.
Domains do not have to comprise a contiguous segment of peptide chain, but this is frequently the case.
Alanine racemase has one structural domain interrupted by a second
Tertiary structure
Number of folds is large but limited - therefore used in different combinations to create diversity:
(a) Tryptophan synthase (b) galactonate dehydratase
Both contain an alpha-beta barrel (yellow)
Internal Packing of a folded protein
• Inside of a protein is packed as tightly as in an organic crystal - largely driven by the hydrophobic effect and van der Waals packing (also electrostatics, etc.)
• position of the side chain - the path of the main chain determines the Calpha-Cbeta vector
• side chain rotamers - coordinated - entropy effects - dihedral angles of the side chains are critical!
• Can think of a packed protein interior as a “3D jigsaw puzzle”
• small cavities can occur
Supersecondary Structure
• Simple assemblies of 2-3 secondary structure elements
• Include turns (to re-orient the chain)
• Modules: used to build up the 3° level folds
• Generally not stable on their own
• May or may not be folding intermediates
Motif | Used to build … | Specialized function?
β-hairpin | antiparallel sheets |
α-hairpin | helix bundles |
β-α-β | parallel sheets, TIM |
Greek key | |
helix-turn-helix | | DNA binding
EF hand | | Ca++ binding
etc.
Much of the material in this section is from: Introduction to Protein Structure (Branden and Tooze)
Loops and hairpins/turns
Loop regions connect helices and sheets and are typically on the surface of the protein
Strands are frequently joined by short beta-hairpin loops
The β-turn / reverse turn
A reverse turn is a region of the polypeptide typically having a hydrogen bond from one main chain carbonyl oxygen to the main chain N-H group 3 residues along the chain (i.e. Oi to Ni+3).
Alternative definition: the Cα atoms of residues i and i+3 must be less than 7 Å apart
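The distance-based definition is usually applied as a cutoff on the Cα(i) to Cα(i+3) distance. A minimal sketch of that check follows; the coordinates are made-up values for illustration, and a real analysis would also exclude residues that are part of a helix.

```python
import math


def is_turn_candidate(ca_i, ca_i3, cutoff=7.0):
    """True if the Cα atoms of residues i and i+3 are closer than cutoff (Å).

    ca_i, ca_i3: (x, y, z) coordinates in Å.
    """
    return math.dist(ca_i, ca_i3) < cutoff


# Hypothetical coordinates (Å), not taken from any structure.
compact = is_turn_candidate((0.0, 0.0, 0.0), (3.2, 4.1, 2.0))   # ~5.6 Å apart
extended = is_turn_candidate((0.0, 0.0, 0.0), (10.0, 0.0, 0.0))  # 10 Å apart
print(compact, extended)
```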
Reverse turns are divided into classes based on the Φ and Ψ angles of the residues at positions i+1 and i+2.
The current classification of β-turns includes 9 distinct types (see next slide): I, I', II, II', IV, VIa1, VIa2, VIb and VIII
Turns are responsible for the compact globular shape of proteins because of the ability to reverse the protein chain direction within a span of several residues.
Note: Not all changes in direction are due to beta-turns. It is very common to connect alpha helices and beta strands with regions of non-regular structure.
Classification of different types of β-turns
Otherwise unclassified / difficult to assess
Types I and II are the most common reverse turns, the essential difference between them being the orientation of the peptide bond between residues at (i+1) and (i+2).
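A classification by (φ, ψ) at positions i+1 and i+2 can be sketched as a lookup against ideal angles with a tolerance. The ideal values below are the commonly tabulated textbook ones for types I, I', II and II'; they are supplied here as an assumption, since the slide's own table did not survive as text. The other turn types are omitted, and the function names are illustrative.

```python
# Ideal ((phi, psi) at i+1, (phi, psi) at i+2) in degrees.
# Commonly tabulated values; an assumption, not copied from the slide.
IDEAL_TURNS = {
    "I": ((-60.0, -30.0), (-90.0, 0.0)),
    "I'": ((60.0, 30.0), (90.0, 0.0)),
    "II": ((-60.0, 120.0), (80.0, 0.0)),
    "II'": ((60.0, -120.0), (-80.0, 0.0)),
}


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def classify_turn(phi1, psi1, phi2, psi2, tol=30.0):
    """Return the matching turn type, or None if no listed type fits within tol."""
    observed = (phi1, psi1, phi2, psi2)
    for name, ((p1, s1), (p2, s2)) in IDEAL_TURNS.items():
        ideal = (p1, s1, p2, s2)
        if all(angle_diff(o, i) <= tol for o, i in zip(observed, ideal)):
            return name
    return None


print(classify_turn(-65, -25, -95, 5))      # close to ideal type I
print(classify_turn(-60, 130, 90, 10))      # close to ideal type II
print(classify_turn(-120, 130, -120, 130))  # extended strand, no match
```

The ψ of residue i+1 (about -30° versus +120°) is what separates types I and II, which reflects the flipped peptide bond between i+1 and i+2.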
These definitions are approximate (+/- 30°)
Type I and II reverse turns
Note that the (i+2) residue of the type II turn lies in a region of the Ramachandran plot which can only be occupied by glycine. From the diagram of this turn it can be seen that were the (i+2) residue to have a side chain, there would be steric hindrance with the carbonyl oxygen of the preceding residue. Hence, the (i+2) residue of type II reverse turns is nearly always glycine.
Position (i+1) can accommodate a proline; these are often seen here, especially in type II turns.
Type I' and II' turns
This type of turn often connects two consecutive beta strands into a “beta hairpin”.
For type I' turns, residue i+2 is “always” glycine whereas for type II' turns residue i+1 is “always” Gly. This is because amino acids other than glycine would cause steric hindrance involving the residue's side chain and the main chain.
Positions i and i+3 have little sequence preference.
Other types of linker motifs: β-corners
β-corners: a 90° change in direction found in antiparallel beta strands
Other types of linker motifs: helix hairpins
Residue 1 is in the bridging or α-helical region of the Ramachandran plot. Residue 2 is always glycine, in a region of the Ramachandran plot that is only available to this residue.
Other types of linker motifs: αα corners
Relative frequencies of amino acid residues in 2° structures
α-helices:
• rich in alanine, glutamate, methionine and leucine: long side chain extends away from helix
• beta-branched residues (valine, threonine, and isoleucine) tend to destabilize the helix due to steric clashes
• serine, aspartate, and asparagine disrupt the helix: their side chains compete for main-chain NH and CO groups
• proline lacks an NH donor, and the preceding residue does not favor angles at ψ ~ -50°
β-sheets:
• rich in valine, isoleucine and phenylalanine
• beta-branched residues are readily accommodated since side chains project out of the peptide plane
• prolines lack an NH donor
Turns:
• rich in glycine, asparagine, and proline
Examples of supersecondary motifs
[Figure: βαβ motif, β hairpin, αα motif]
αα motif or helix-turn-helix motifs
helix-turn-helix (often involved in DNA binding)
(a) DNA binding motif
(b) Calcium binding motif (EF-hand)
DNA binding helix-turn-helix motif
[Figure: the transcription factor SATB1, with the turn labelled]
Second helix generally binds the major groove of DNA (H-bonds and vdW)
N-terminal helix stabilises the interaction
Calcium binding motif or EF hand
• Two helices flanking a loop of 12 residues
• Key residues in the loop are under structural and functional selection and are therefore conserved (e.g. Glu, Gly); they are required for Ca binding
Troponin-C is made up of 4 EF motifs, two of which bind Ca
DNA binding basic helix-loop-helix motif
DNA binding helix with basic residues
Obligate dimers: the second helix forms the dimer interface (here through a leucine zipper coiled-coil)
C-myc bound to DNA
Helix-Helix associations
Can be intramolecular or intermolecular
Consider parallel and antiparallel here, but many other crossing angles occur.
Parallel coiled-coil (e.g. GCN4 transcription factor; bZIP)
[Figure: helical wheel with heptad positions a to g]
MKQLEDKVEELLSKNYHLENEVARLK
abcdefgabcdefgabcdefgabcde
"Leucine zipper":
a positions: M, V, N*, V
d positions: L, L, L, L
3.6 residues/turn ---> 100°/residue
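The a and d assignments can be checked by pairing the sequence with its heptad register and pulling out the residues at each position. This is a small illustrative sketch; the function name is not from the slides.

```python
SEQUENCE = "MKQLEDKVEELLSKNYHLENEVARLK"
REGISTER = "abcdefgabcdefgabcdefgabcde"


def residues_at(position, sequence=SEQUENCE, register=REGISTER):
    """Return the residues found at one heptad position (a to g)."""
    return [aa for aa, pos in zip(sequence, register) if pos == position]


a_residues = residues_at("a")
d_residues = residues_at("d")
print("a positions:", ", ".join(a_residues))
print("d positions:", ", ".join(d_residues))
```

Every d position is a leucine, spaced seven residues apart, which is the origin of the "leucine zipper" name.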
Often talk about a heptad repeat - but 7x100° is 700° , or 1.94 turns of helix. (Need to go 18 residues before reaching an integral number of turns).
The “7-pointed star” in helical wheel projection assumes 102.8° turn /residue.
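The arithmetic behind these numbers can be written out directly. The sketch assumes an ideal α helix with exactly 100° of twist per residue; variable names are illustrative.

```python
from math import gcd

TWIST = 100  # degrees per residue for an ideal alpha helix (360 / 3.6)

# One heptad on a straight alpha helix: 700 degrees, just short of two turns.
turns_per_heptad = 7 * TWIST / 360

# Smallest number of residues giving a whole number of turns.
residues_for_whole_turns = 360 // gcd(360, TWIST)
whole_turns = residues_for_whole_turns * TWIST // 360

# Twist per residue needed for 7 residues to span exactly 2 turns,
# which is what the 7-pointed helical wheel assumes.
wheel_twist = 2 * 360 / 7

print(f"{turns_per_heptad:.2f} turns per heptad")
print(f"{residues_for_whole_turns} residues = {whole_turns} turns")
print(f"{wheel_twist:.3f} degrees per residue on the wheel")
```

The exact wheel value is 720°/7 ≈ 102.857°, which the slide quotes as 102.8°.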
[Figure: heptad register abcdefg/abcdefg/abcd drawn on a straight helix axis and on a supercoiled helix axis, with positions a and d marked]
Often see a "heptad repeat" in natural protein sequences. These almost always fold as a "coiled-coil".
In a coiled-coil, the helix axis is itself a helix. But can plot as a helical wheel with a straight axis in which the twist (turn/residue) is (2 x 360°)/7 = 102.8°.
The local twist is still 100°, NOT 102.8°.
There are 3.6 residues/turn in an alpha helix, so an alpha-helix twists by 100°/residue.
1 turn = 360°, and 360°/3.6 = 100°
5 turns x 3.6 residues/turn = 18 residues
5 turns = 1800° (divisible by 360°)
These are values for ideal helices (unbent).
Positions a and d are typically designated as the interface residues in a coiled-coil.
Many variations on the theme.
parallel or antiparallel
dimer, trimer, tetramer, pentamer, etc.
Contacts are "knobs into holes" (or ridges in grooves), not "zippers"
Walshaw and Woolfson, JMB (2001). Apostolovic et al., Chem. Soc. Rev. (2010).
Coiled-coils are often used in structural proteins
[Figure: myosin and tropomyosin (schematic)]
Alpha Domains - contain only alpha helix
Two common motifs:
Globin fold: "bag" of 8 helices arranged at +90° and +50° angles to each other. This gives a large hydrophobic pocket in the domain's interior, to which large hydrophobic organic and organometallic groups can bind. Named after myoglobin.
4 helix bundle: 4 anti-parallel helices, each crossing the next at an angle of about -20°
Four-helix bundle (up-down-up-down)
- one of the simplest folds
- widely used as a protein design target
- two consecutive α-α motifs
- all helix-helix contacts are antiparallel
Beta Domains
Pattern of connection between sheets gives rise to sheets with two distinct topologies:
Greek Key motif
Up-and-down motif
Connections between adjacent β-sheets
- Anti-parallel: β-hairpin
- Parallel: right-handed connection (almost always)
- Parallel: left-handed connection (very very rare)
β-hairpin
Bovine Pancreatic Trypsin Inhibitor (BPTI)
Snake venom toxin
Greek Key
Staphylococcus nuclease
Beta sandwich
The variable (VL) and constant (CL) domains of a light chain
Sandwiches with 3+4 strands
(with insert (red) in the variable domain)
Has two Greek key motifs
The sandwich/barrel distinction is not absolute: in some Ig folds the first or seventh strand switches sheets, forming a partial barrel
The jelly roll fold
Sheet facts
• Repeat distance is 7.0 Å
• R groups of the amino acids alternate up-down-up, above and below the plane of the sheet
• Strands are 2 - 15 amino acid residues long
• 2 - 15 strands per sheet
• Average of 6 strands with a width of 25 Å
• Parallel less stable than anti-parallel
• "Always" twisted
β-α-β motif
TIM barrel
• parallel beta-strands connected by longer regions containing alpha-helical segments
• almost always has a right-handed fold
Alpha/Beta domains
Two examples of how the beta-alpha-beta motif can be used to build up tertiary structure.
Triosephosphate isomerase (TIM)
Nucleotide binding domain (Rossmann fold)
Alpha+Beta domains
- Contain both beta sheets and alpha helices, but typically segregated
- No special organization principles
- Usually just clusters of interacting helices
- Beta sheets tend to be anti-parallel or mixed
TATA-binding protein
Same types of 2° elements can come together in many different ways
[Figure: TIM and DHFR]
Often classified according to arrangement of 2° structural elements in linear sequence and in space
Tertiary structure creates a complex surface topology that enables proteins to interact with small molecules or other proteins
Repetitive structures are common
LRR (Leucine-Rich Repeats): beta-loop-alpha, e.g. ribonuclease inhibitor
β-helix topology, e.g. pectate lyase C
Here, each repeat is 22-28 residues long (individual repeats do not fold on their own)
Interactions that hold the 3° structure of a protein together can include cross-linking elements, although folding is primarily driven by the hydrophobic effect:
- Disulphide bonds (e.g. BPTI)
- Metals (e.g. Ca2+ in subtilisin)
- Cofactors (e.g. heme in myoglobin)
Enzyme active sites are often at turns
A selection of domains involved in protein-protein interactions
From: Protein Structure and Function (Petsko and Ringe)
Many proteins are modular and are made up of smaller domains: eukaryotic signaling, transcription, etc.
Examples of protein domain databases:
- Interpro: www.ebi.ac.uk/interpro
- Pfam: www.sanger.ac.uk/resources/databases/Pfam
- SMART: http://smart.embl-heidelberg.de
- Prosite: www.expasy.org/prosite
Divide and conquer is not always the answer!
Proline-rich
Multidomain signaling switch: Src family tyrosine kinase Hck. Switches between "open" (not self-associated) and "closed" (self-associated) forms. F. Sicheri, J. Kuriyan