<<

Examples of Modeling

n Visualization

n Examination of an experimental structure to gain insight Protein Modeling about a research question n Dynamics

n To examine the dynamics of protein structures

n To examine binding free energy differences of ligands

n

n To explore fit of a small molecule against a protein

n Computational Model Development

n prediction from sequence

n Homology modeling

CHEM8711/7711: 2

Protein Structure Description Primary Structure

Protein Structure can be Described at Several Levels O H2N CHC OH CH n 2 Primary O CH + 2 H2N CHC OH CH n Linear sequence of amino acids in protein chain 2 CH3 NH Alanine n Secondary C NH NH2 n Three-dimensional local conformation

N-terminus O O C-terminus n Tertiary H H2N CHC N CHC OH n Overall fold of an entire protein chain CH3 CH2 CH2 n Quaternary CH2 NH n Overall shape of a multi-chain protein C NH

NH2 Alanylarginine - primary structure conventionally described starting from the N-terminus CHEM8711/7711: 3 CHEM8711/7711: 4

Protein Sequence Sources Importing Sequences to MOE n GeneBank: www.ncbi.nlm.nih.gov/Entrez/ n Option I n Copy the sequence into a text file n Protein Databank: www.rcsb.org (not limited to primary structure) n Insert a line prior to the sequence that starts with >, followed by a name for the sequence n Swiss-Prot: www.expasy.ch/sprot/ n Save the text file with a .fasta extension n Option II

n Display the file as fasta in the source database

n Save with a .fasta extension n Open the file in MOE, and view the sequence from the sequence editor

CHEM8711/7711: 5 CHEM8711/7711: 6

1 Class Exercise I n Use one of the protein databases to locate several related protein sequences (ask for suggestions if you aren’t currently interested in any ) – One guarantee of relation is to find proteins with the same name from different species n Import them into MOE n Make sure you know how to select residues (single residues, continuous stretches of residues, and scattered residues)

CHEM8711/7711: 7 CHEM8711/7711: 8

Class Exercise II Protein Secondary Structure n Align the sequences that you have imported n Examples

into MOE n n From the sequence window choose n Beta sheet Homology->Align n Beta n Open the commands window to see the n Major stabilizing contributions pairwise residue identities for your aligned n Hydrogen bonding

sequences n Relief of steric crowding

CHEM8711/7711: 9 CHEM8711/7711: 10

Useful MOE Tools Notes on Experimental Structures

n Experimental protein structures are determined mainly n Sequence Window by two methods: n Display menu allows you to highlight actual n NMR 1 1 secondary structures (red=helix, yellow=sheet) n Often by H- H NOE enhancements

n Display menu allows you to highlight hydrogen n Structures are then modeled to be consistent with the bonding (generally only for the backbone) distance data derived from the spectra n X-ray crystallography n Main Window n X-ray diffraction patterns are dependent on electron density

n Render->Draw menu allows you to show n Hydrogen atoms have negligible electron density and are hydrogen bonds and protein ribbon diagram not present in x-ray structures n The O and N of a terminal amide have similar electron density and are often placed on the basis of expected hydrogen bonding CHEM8711/7711: 11 CHEM8711/7711: 12

2 Class Exercise III Protein Tertiary Structure

n Download a protein structure from the n Covalent stabilization:

Protein Databank and add hydrogens (Edit- n Disulfide bond formation

>Add Hydrogens) n Non-covalent stabilization:

n Isolate an alpha-helical secondary structure n Hydrophobic sequestration from water (entropy

n Examine hydrogen-bonding in that region driven)

n Examine spacing of the sidechains n Salt bridge formation (enthalpy driven)

n Isolate a beta-sheet secondary structure (you n Hydrogen bonding (enthalpy driven) will need non-contiguous parts of the sequence) and examine similarly

CHEM8711/7711: 13 CHEM8711/7711: 14

pH, pKa and Ionizability Class Exercise IV

Electrostatic (charge) interactions are an n Show all atoms of your protein as a space- important feature of protein structure and filling model activity - models cannot ignore ionizability and n Select all atoms that are part of hydrophobic charge residues n Ammonium pKa ~9, thus will remain cationic in water - this generalizes to amine groups in proteins n Visually decide how well they are screened n Carboxylic acid pKa’s ~5, thus will lose their protons away from the in water - this generalizes to carboxylic acid groups in proteins

n Check out the section on amino acids in your text for other groups to be aware of

CHEM8711/7711: 15 CHEM8711/7711: 16

Class Exercise V Crystallographic Structures

n Crystallography does not identify hydrogen positions n Isolate all cysteine residues in your protein - they must be added n Are any involved in disulfide linkages? n Ionization of standard residues will be handled automatically (groups with pKa’s near 7, like histidine, should be manually checked) n Isolate all cationic and anionic residues in n Residues may be unresolved (missing) your protein n No partial charges included, must be assigned n Are any near enough in space to be n Non-standard residues may be incorrectly assigned participating in salt bridges? atom types n Resolution and crystal packing effects contribute to the fact that the structures are NOT energy minimized in your forcefield! CHEM8711/7711: 17 CHEM8711/7711: 18

3 Example: Missing Residues Example: Incorrect Atom Types

n Download PDB entry 1f88 n Download PDB entry 1ptr

n Identify regions in both chains that are n Add hydrogens

missing amino acids n Isolate the ligand and draw its structure O O n Chain 1 R O n residue 235 is not attached to residue 236 O R

n Chain 2 H3C CH n residue 142 is not attached to residue 143 HO 3

n Residue 221 is not attached to residue 222 OH

H3C OH

O CHEM8711/7711: 19 CHEM8711/7711: 20

Homology Modeling Homology Modeling Needs

n Founding Assumption: n Alignment of:

n Template sequence with known structure homologous primary structure n Target sequence with known sequence AND n Knowledge about the function of both homologous function proteins INDICATE n Knowing residues critical for function allows homologous tertiary structure examination of homology in those regions – Overlay of should be higher than overall homology ASV and HIV n Greater functional homology indicates likelihood integrase structures - 25% that proteins have a common ancestor sequence homology CHEM8711/7711: 21 CHEM8711/7711: 22

Homology Modeling Method Class Exercise VI

n Search the PDB for one of the proteins in n For regions of identical length: your set from exercises I and II n Protein target backbone is taken from template n Use this protein as your template structure, n Identical target sidechains are taken from template and any of the other sequences as your n Nonidentical target sidechains are derived from library target n For target sequence insertions (indels) n (If you find two in the pdb, use one as the n High resolution structures from PDB are scanned to template and the other as the target -> you can find those that have regions that superpose on the compare your modeled structure with the anchor residues on either side experimental structure when you are done) n Build a homology model (Homology ->Homology Model)

CHEM8711/7711: 23 CHEM8711/7711: 24

4 Evaluating Protein Models Class Exercise VII n Final structures after minimization may have structural n Evaluate your protein model and identify features inconsistent with known protein structural n Steric problems characteristics n Residues connected by cis-amide bonds n Examples

n Cis-amide bonds at locations other than proline n Dihedrals outside the expected ranges

n Incorrect stereochemistry at alpha or beta carbons n Reversed stereochemistry at alpha carbons n Strained bond angles

n Dihedral angles that match unpopulated regions of a Ramachandran plot

n Steric bumps

n Evaluation tools accessible in the sequence window

n Measure->Protein Report

n Measure->Protein Contacts CHEM8711/7711: 25 CHEM8711/7711: 26

Reading n First Edition

n Section 8.13 n Second Edition

n Section 9.10

n Chapter 10

CHEM8711/7711: 27

5