Functional Motifs

Various Sources

Lecture 7 Biochemistry 4000 Slide 1 Functional Motifs

Functional Motif Definition(s): Originally – a structural motif that performs a biological function
● Short continuous stretch of primary sequence
● Defined in terms of architecture

Bioinformatics era – any primary sequence pattern that is associated with a biological function.
● Sequence fragments from anywhere in the primary sequence
● Defined in terms of primary sequence

Why has the term 'evolved' (opinion)?
● Bioinformatic sequence annotation relies heavily (and successfully) on primary sequence pattern searches
● Use of primary sequence information is far more widespread than the use of structural information (e.g. majority rules)

Figures: Helix-loop-helix; Catalytic triad of serine proteases (residues 57, 102 and 195 labelled)

Helix-loop-helix (EF-hand) (Ca2+ binding)

Helix-loop-helix: Ca2+ binding motif composed of two orthogonal helices and a connecting loop of 12 residues (~30 residues in total)
● Helices within a single motif make few contacts (left)
● Typically occur in pairs forming a 4-helix orthogonal bundle (right)

EF-hand: Older name derived from original structural studies on parvalbumin
● Helices E and F form the helix-loop-helix

Intended to describe 3D shape

Ca2+ binding loop

Ca2+ binding site: 5 equatorial and 2 axial ligands coordinate a central Ca2+
● Pentagonal bipyramid coordination
● All contacts are from the 12 residue loop

Highly conserved Asp and Glu side-chains make 5 contacts with the Ca2+ (remaining contacts are from the main chain and/or a bridging water)

Numbering represents sequence position within the 12 residue loop.

Positions 7 and 9 contribute main-chain contacts and residues at these positions are not conserved

Sequence Conservation in Helix-turn-Helix Motifs

Primary Sequence Logo for known Helix-turn-Helix Motifs

Represents all known residues at each position in an alignment of all helix-turn-helix motifs.

Schematic representation of a helix-turn-helix motif (n stands for non-polar)

Conservation (y-axis) vs sequence position (x-axis) is shown by the overall height of each position, and the frequency of residues at each sequence position is shown by the height of the individual one-letter codes
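The two heights described above can be sketched numerically. This is a minimal sketch that assumes the common Shannon-information form of a sequence logo (stack height = log2(20) minus the column entropy, letter height = residue frequency × stack height) with no small-sample correction. The function name and the three aligned fragments are hypothetical illustrations, not data from the lecture.

```python
import math
from collections import Counter


def logo_heights(alignment):
    """For each alignment column return (stack_height_bits, {residue: letter_height_bits})."""
    max_bits = math.log2(20)  # maximum information for a 20-letter alphabet
    columns = []
    for column in zip(*alignment):
        counts = Counter(column)
        total = len(column)
        freqs = {res: n / total for res, n in counts.items()}
        # Shannon entropy of the column (0 when the position is invariant)
        entropy = -sum(f * math.log2(f) for f in freqs.values())
        stack = max_bits - entropy              # conservation: height of the whole position
        letters = {res: f * stack for res, f in freqs.items()}  # height of each one-letter code
        columns.append((stack, letters))
    return columns


# Hypothetical aligned fragments, for illustration only.
toy_alignment = ["DKDGDG", "DADGNG", "DKDGSG"]
heights = logo_heights(toy_alignment)
for position, (stack, letters) in enumerate(heights, start=1):
    print(position, round(stack, 2), {r: round(h, 2) for r, h in letters.items()})
```

An invariant column reaches the maximum stack height, while a variable column is shorter and its letters split that height in proportion to their frequencies.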

Sequence Conservation in Helix-loop-Helix Motifs

Primary Sequence Logo and Helix-loop-helix structure

Calmodulin helix-loop-helix

Position 1 is invariant – Asp coordinates an axial position of Ca2+
Positions 3 and 5 – two Asp (or oxygen-containing residues) coordinate equatorial positions of Ca2+
Position 12 – Glu that forms bidentate (two) interactions with equatorial positions of Ca2+
Position 6 – Gly due to main-chain conformation

Remaining highly conserved positions contain non-polar residues that stabilize the helix-loop-helix structure

Structural comparison of Helix-loop-helix motifs

Structural superposition: Least-squares minimization of the distances between equivalent atomic coordinates allows structures to be superposed (positioned relative to a common origin)

Graphical representation of structural similarities and differences

Superposition of several structures from families of Helix-loop-helix Ca2+ binding proteins

A) Calmodulin family

B) Parvalbumin family

C) Troponin family

In all cases, the helix-loop-helix motifs adopt similar structures (with small root mean-square deviations)

Structural comparison of Helix-loop-helix motifs

Structural superposition: another example

Structural superposition of Calmodulin bound to specific inhibitors:

Grey: Calmodulin bound to four trifluoperazine (TFP)

Blue/Red: Calmodulin bound to KAR-2

Structural differences observed for calmodulin bound to different inhibitors are comparable to the differences between Calmodulin family members (previous slide, panel A)

Conformational Change upon substrate binding

Ca2+ free (apo): Helices of Helix-loop-Helix are roughly parallel and the loop directs conserved Asp/Glu residues into bulk solvent

Ca2+ bound (holo): Helices of Helix-loop-Helix are roughly orthogonal and the loop wraps around Ca2+ directing the Asp/Glu residues at the ion

Conformational change upon Ca2+ binding uncovers a hydrophobic surface that is a protein and peptide binding site

Figure label: Exposed non-polar surface

Facilitates role in signal transduction as the hydrophobic surface modulates the activity of proteins that it binds

e.g. Calmodulin-dependent protein kinase
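The "roughly parallel" versus "roughly orthogonal" description of the two helices can be made quantitative as an interhelical angle. This is a minimal sketch in which each helix axis is approximated by the vector between the centroids of its first and last four Cα atoms. All function names are hypothetical, and the coordinates are synthetic ideal helices (100° per residue, 1.5 Å rise, 2.3 Å radius), not calmodulin coordinates.

```python
import math


def centroid(points):
    """Mean position of a list of (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def helix_axis(ca_coords, window=4):
    """Approximate axis: centroid of the first `window` Cα atoms to centroid of the last `window`."""
    start = centroid(ca_coords[:window])
    end = centroid(ca_coords[-window:])
    return tuple(e - s for s, e in zip(start, end))


def interhelical_angle(helix_a, helix_b):
    """Angle in degrees (0 to 180) between the approximate axes of two helices."""
    a, b = helix_axis(helix_a), helix_axis(helix_b)
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.hypot(*a) * math.hypot(*b)
    cos_angle = max(-1.0, min(1.0, dot / norms))  # clamp against rounding error
    return math.degrees(math.acos(cos_angle))


def ideal_helix(n_residues, along="z", offset=(0.0, 0.0, 0.0)):
    """Synthetic Cα trace of an ideal alpha-helix running along the z or x axis."""
    coords = []
    for i in range(n_residues):
        phi = math.radians(100.0 * i)
        x, y, z = 2.3 * math.cos(phi), 2.3 * math.sin(phi), 1.5 * i
        if along == "x":
            x, y, z = z, x, y  # cyclic permutation points the helix along x
        coords.append((x + offset[0], y + offset[1], z + offset[2]))
    return coords


helix_e = ideal_helix(12, along="z")
parallel_angle = interhelical_angle(helix_e, ideal_helix(12, along="z", offset=(10.0, 0.0, 0.0)))
orthogonal_angle = interhelical_angle(helix_e, ideal_helix(12, along="x", offset=(0.0, 10.0, 0.0)))
print(f"parallel pair: {parallel_angle:.1f} degrees, orthogonal pair: {orthogonal_angle:.1f} degrees")
```

With real structures the Cα coordinates of the two helices would be read from the apo and holo coordinate files and compared in the same way.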

Helix-loop-Helix Family

Canonical EF-hands: Helix-loop-helix motifs with 12 residue loops that bind Ca2+ using conserved Asp/Glu residues at positions 1, 3, 5 and 12. Note: all previously discussed Helix-loop-Helix proteins contain canonical EF-hands

Pseudo EF-hands: Helix-loop-Helix motif at the N-terminus of S100 proteins. Loop of 14 residues that binds Ca2+ using carbonyl oxygens at positions 1, 4, 6 and 9

Figure label: S100 (apo)

Helix-loop-Helix Proteins

Functional Roles: Two classes of Helix-loop-Helix proteins

1) Signaling proteins: larger group that includes calmodulin, troponin and S100 - all undergo Ca2+ dependent conformational change

2) Transport/Buffering proteins: Calbindin D9K only - does not undergo conformational change

Phylogenetic tree for Helix-loop-Helix family proteins - circles=canonical, squares=pseudo, solid are known to bind Ca2+

Helix-loop-Helix Proteins (Humans)

More than 100 human proteins contain a Helix-loop-Helix functional motif (2009)

Helix-loop-Helix Diversity

Eucaryotes: Helix-loop-helix proteins have primary roles in signal transduction.

Q? Do procaryotes have helix-loop-helix proteins? A. Yes

Procaryotic helix-loop-helix proteins are more diverse than eucaryotic proteins. Roles in maintaining Ca2+ homeostasis and in signaling in bacteria

Procaryotic helix-loop-helix proteins have a greater diversity of loop sizes (9, 10, 12, 15) and interhelical packing

Divergent procaryotic EF-hand like protein: different loop size and interhelical packing with the same Ca2+ coordination

Detecting Helix-loop-Helix Proteins

Sequence pattern: Derived from structural studies and primary sequence alignments

Automated identification of Helix-loop-Helix Proteins is successful in > 80% of cases

ProSite

Automated annotation of primary sequence based upon known functional motifs identified from sequence
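A primary sequence pattern search of this kind can be sketched as a regular expression. The pattern below is a simplified, hypothetical one built only from the conserved loop positions listed in this lecture (Asp at 1, oxygen-containing residues at 3 and 5, Gly at 6, Glu at 12). It is not the ProSite entry, and the choice of D/N/S/T/E/Q for "oxygen-containing" is an assumption. The test fragment is modelled on the first Ca2+ binding loop of calmodulin with arbitrary flanking residues.

```python
import re

# Positions:  1  2  3          4  5          6  7-11  12
# Residues:   D  x  [DNSTEQ]   x  [DNSTEQ]   G  xxxxx E
# The lookahead lets overlapping candidate loops be reported as well.
EF_LOOP_PATTERN = re.compile(r"(?=(D.[DNSTEQ].[DNSTEQ]G.{5}E))")


def find_ef_loops(sequence):
    """Return (1-based start position, 12-residue match) for every candidate loop."""
    return [(m.start() + 1, m.group(1)) for m in EF_LOOP_PATTERN.finditer(sequence)]


# Illustrative fragment: three flanking residues on each side of a 12-residue loop.
fragment = "AAADKDGDGTITTKELLL"
hits = find_ef_loops(fragment)
for start, loop in hits:
    print(start, loop)
```

A pattern this loose will produce false positives and miss divergent loops (e.g. the 14 residue pseudo EF-hand loop), which is one reason automated identification is not successful in every case.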

Comparing Protein Structures

Quantifying Structural Similarity: Domain fold classifications (e.g. SCOP, CATH, etc.) are based upon backbone 'structural' similarity between proteins of known structure

Difficult to quantify! Virtually impossible to come up with a single value that represents 3D structural similarity

$$\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \lVert x_i - y_i \rVert^{2}}$$

where xi and yi are equivalent atoms in the two structures (x and y) being superposed, and N is the number of equivalent atom pairs

Techniques for quantifying structural similarity: Most (all?) approaches are based upon the superposition or structural alignment of 2 (or more) structures.

Superpositions or structural alignments are typically calculated by minimizing the RMSD (root mean square deviation) of equivalent atomic coordinates

Note: there are many different algorithms for calculating superpositions that primarily differ with respect to the amount of user input required and the underlying mathematics
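The RMSD-minimizing superposition described above can be sketched with the Kabsch algorithm, which is one of the many available algorithms and not necessarily the one used for the figures in this lecture. The sketch assumes NumPy is available and that the equivalent atoms have already been paired; the function name and the random test coordinates are hypothetical.

```python
import numpy as np


def kabsch_superpose(mobile, target):
    """Least-squares superposition of `mobile` onto `target` (N x 3 arrays of paired atoms).

    Returns the transformed mobile coordinates and the resulting RMSD.
    """
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    mobile_center = mobile.mean(axis=0)
    target_center = target.mean(axis=0)
    p = mobile - mobile_center      # move both coordinate sets to a common origin
    q = target - target_center
    h = p.T @ q                     # 3 x 3 covariance matrix
    u, _, vt = np.linalg.svd(h)
    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    fitted = (rotation @ p.T).T + target_center
    rmsd = np.sqrt(((fitted - target) ** 2).sum(axis=1).mean())
    return fitted, rmsd


# Synthetic check: rotate and translate a random "structure", then recover it.
rng = np.random.default_rng(0)
target = rng.normal(size=(10, 3))
angle = np.radians(40.0)
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle), np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
mobile = target @ rot_z.T + np.array([5.0, -3.0, 2.0])
fitted, fit_rmsd = kabsch_superpose(mobile, target)
print(f"RMSD after superposition: {fit_rmsd:.2e}")
```

Because the mobile set here is an exact rigid-body copy of the target, the RMSD after superposition should be zero to within rounding error; real structures leave a residual that reflects genuine structural differences.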

Comparing Protein Structures

RMSD from superposition:

RMSD values are strongly dependent upon:
1 – atoms used for superposition (main, side, domain, …)
2 – size of protein (and resolution of structure)
3 – large outliers (i.e. regions of structure that are far apart in the superposition)
4 – insertions and deletions in primary sequence

Without information regarding the number and identity of atoms used in the superposition, the RMSD values are largely meaningless

Examples: Identical protein structures (300 residues) in different space groups (i.e. independent X-ray structure determinations); homologs (same length) sharing 80%, 50% and 30% sequence identity

Structures      Superposition   Residues   ~ RMSD
Identical       Cα atoms        All        0.5 Å
Identical       All atoms       All        1.5 Å
80% identity    Cα atoms        All        1.0 Å
50% identity    Cα atoms        All        1.5 Å
30% identity    Cα atoms        All        2.0 Å

Note: Superposition of NMR and X-ray structures generally produces (slightly) larger RMSDs than superpositions of two NMR or two X-ray structures - experimental differences account for this observation
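The sensitivity of RMSD to large outliers and to the choice of atoms can be illustrated with a small calculation. This is a minimal sketch on already-superposed coordinates; the function name and the two synthetic 100-atom traces (every pair 0.5 Å apart except one pair 10 Å apart) are assumptions made for illustration, not values from the table above.

```python
import math


def rmsd(coords_x, coords_y):
    """Root mean square deviation between two equal-length lists of paired (x, y, z) atoms."""
    if len(coords_x) != len(coords_y):
        raise ValueError("both structures must contribute the same number of atoms")
    squared = sum(math.dist(x, y) ** 2 for x, y in zip(coords_x, coords_y))
    return math.sqrt(squared / len(coords_x))


# Synthetic, already-superposed Cα traces: 100 atoms spaced 3.8 Å apart.
structure_x = [(3.8 * i, 0.0, 0.0) for i in range(100)]
structure_y = [(3.8 * i, 0.5, 0.0) for i in range(100)]  # uniform 0.5 Å deviation
structure_y[-1] = (3.8 * 99, 10.0, 0.0)                  # one divergent atom, 10 Å away

rmsd_all = rmsd(structure_x, structure_y)             # outlier included
rmsd_core = rmsd(structure_x[:-1], structure_y[:-1])  # outlier excluded
print(f"all atoms: {rmsd_all:.2f} Å, without the outlier: {rmsd_core:.2f} Å")
```

A single divergent atom out of 100 more than doubles the value here, which is why an RMSD is hard to interpret without knowing which atoms were included.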

Comparing Protein Structures

Divergent loop

Two views of the superposition of the N-terminal domains of Calmodulin (blue) and Troponin C (red)

Entire polypeptide (141 residues): RMSD (all Cα atoms) 2.21 Å

N-terminal domain (74 residues): RMSD (all Cα atoms) 1.35 Å

Superposition of Calmodulin (blue) and Troponin C (red) - 45% identity with 2 insertions / deletions

In virtually all cases, loops (and straps) are sites of greatest divergence in the superposed structures.

Comparing Protein Structures

Divergent segments

Superposition of the pseudo-EF hand S100A (blue) and Troponin C (red) - 22% sequence identity and 2 insertions / deletions

S100A and TNC (72 residues): RMSD (all Cα atoms) 3.02 Å

Superposition of Parvalbumin (blue) and Troponin C (red) - 23% sequence identity and 4 insertions / deletions

Parvalbumin and TNC (100 residues): RMSD (all Cα atoms) 2.27 Å
