PyMOL Modelling Workshop

My website: http://pldserver1.biochem.queensu.ca/~rlc/work/teaching/BCHM442/

There you will find links to download the latest educational version of PyMOL as well as a link to my “Introduction to PyMOL”, which in turn has links to other people's PyMOL tutorials. Note also the PyMOL Wiki: http://pymolwiki.org.

Structure files can be found by searching the Protein Data Bank (PDB), which has an easy-to-remember website: http://www.pdb.org. The PDBe (PDB Europe, http://www.pdbe.org) contains the same databank of structures, but with a different web interface for searching and a different set of tools for analyzing structures.

What is in a PDB file? Lots of information in the “header” (the section of the file preceding the actual atomic coordinates) as well as the coordinates for the atoms.

When assessing a structure, one needs to take account of the resolution and R-factor, error estimates and missing residues. There is information about the sequence that was used to determine the structure, with a sequence database reference. There is also information about the biological unit. In the case of crystal structures, the biological unit may need to be generated by applying crystallographic symmetry operators, although there are web sites that also try to provide that information (e.g. http://www.ebi.ac.uk/msd-srv/pisa/).
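As a rough illustration of where those quality indicators sit in the header, here is a minimal Python sketch that pulls the resolution, R-factors and missing residues out of REMARK records. The function name `summarize_header` and the sample text are hypothetical; the sample values are made up and do not come from a real entry, and real headers contain variations that this sketch ignores.

```python
import re

# Made-up header lines laid out like PDB REMARK records (not a real entry).
SAMPLE_HEADER = """\
HEADER    HYDROLASE                               01-JAN-00   XXXX
REMARK   2 RESOLUTION.    1.80 ANGSTROMS.
REMARK   3   R VALUE            (WORKING SET) : 0.195
REMARK   3   FREE R VALUE                     : 0.234
REMARK 465   M RES C SSSEQI
REMARK 465     MET A     1
REMARK 465     GLY A     2
ATOM      1  N   ALA A   3      11.104   6.134  -6.504  1.00 20.00           N
"""

RESOLUTION = re.compile(r"RESOLUTION\.\s+(\d+\.\d+)\s+ANGSTROMS")
R_VALUE = re.compile(r"\s*(FREE\s+)?R VALUE\s+(?:\(WORKING SET\)\s*)?:\s*(\d+\.\d+)")


def summarize_header(text):
    """Collect resolution, R-factors and missing residues from REMARK lines."""
    summary = {"resolution": None, "r_work": None, "r_free": None, "missing": []}
    for line in text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            break  # the header ends where the coordinates begin
        if not line.startswith("REMARK"):
            continue
        number = line[6:10].strip()  # remark number, e.g. "2", "3" or "465"
        body = line[10:]
        if number == "2":
            match = RESOLUTION.search(body)
            if match:
                summary["resolution"] = float(match.group(1))
        elif number == "3":
            match = R_VALUE.match(body)
            if match:
                key = "r_free" if match.group(1) else "r_work"
                summary[key] = float(match.group(2))
        elif number == "465":
            fields = body.split()
            # Data lines look like "MET A 1"; explanatory lines are skipped.
            if len(fields) == 3 and fields[2].lstrip("-").isdigit():
                summary["missing"].append((fields[0], fields[1], int(fields[2])))
    return summary


summary = summarize_header(SAMPLE_HEADER)
print(summary)
```

To try it on a downloaded file, one could pass `open("some_file.pdb").read()` (file name hypothetical) in place of `SAMPLE_HEADER`.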

Warning: One cannot blindly trust a crystal structure to give a completely accurate picture of reality. Reading the paper that describes the structure is a good start!

Outline of PyMOL usage

PyMOL is a very powerful, scriptable (customizable) tool for making publication-quality figures and performing analyses on structures. PyMOL is not the only good program for looking at and analyzing protein structures and making pretty pictures. Other similar programs are VMD (www.ks.uiuc.edu/Research/vmd/) and Chimera (www.chimera.ucsf.edu/), both of which are also available for Linux, Windows and Mac OS X. If you are already familiar with VMD or Chimera, you can use one of them for the assignment.

In the text below, bold italics will represent PyMOL commands that can be typed in the PyMOL command line box.

Basic use of PyMOL

• help commands for list of many commands
• help and ? functions
• fetch structures
  ◦ e.g. fetch 3bow
• Functions of “A”, “S”, “H”, “L” and “C” menus next to object name
• Structure display modes:
  ◦ the difference between "show" and "as" and "hide"
    ▪ line, stick
    ▪ cartoon
    ▪ sphere, nb_sphere
    ▪ surface and mesh
• Colouring
  ◦ by atom type (with variety of colours for carbons)
    ▪ note first option of by element of “C” menu where colour of carbons is not changed
  ◦ by chain
  ◦ by secondary structure
  ◦ by B-factor
  ◦ by custom user-defined options
    ▪ e.g. colour everything lightblue and colour cysteines yellow
• Selections
  ◦ selection by clicking on a residue
  ◦ selection via the menu and sequence viewer
  ◦ selection via the command line
  ◦ help selections
  ◦ modifying selections via the menu (or command line)
    ▪ around, expand, within, extend (by n bonds)
    ▪ special selection keywords: polymer, organic, solvent, backbone, sidechain, etc.
    ▪ selection by b-factor, secondary structure etc.
  ◦ http://www.pymolwiki.org/index.php/Selection_Algebra
• More display modes
  ◦ depth cueing and ray-tracing
  ◦ mixing display modes
    ▪ cartoon for backbone and stick for just the side chains
  ◦ surface
    ▪ to display surface of enzyme, may need to create a separate object as we'll see with some of the enzyme/inhibitor and protein/DNA complexes
• Dividing up structures
  ◦ Using the 3bow structure, separate the three chains into separate new objects.
    ▪ can be done with the menu: select a chain (pkchain -> copy to object)
    ▪ can be done on command line: create 3bow_a, 3bow & c. a
• Similarly can merge structures together
  ◦ create dimer, 3bow_a or 3bow_b
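The chain-splitting step at the end of the list above can also be scripted through PyMOL's Python interface. The following is a minimal sketch, not a prescribed method: the helper name `split_chains` and the stand-in class `RecordingCmd` are hypothetical, and the chain identifiers A, B and C are assumed. The helper takes the `cmd` object as an argument so that it can be read and tried without PyMOL installed; the stand-in simply records which `create` calls would be issued.

```python
def split_chains(cmd, obj, chains):
    """Copy each listed chain of `obj` into its own object, e.g. 3bow_a."""
    names = []
    for chain in chains:
        name = f"{obj}_{chain.lower()}"
        # Same idea as typing: create 3bow_a, 3bow & chain A
        cmd.create(name, f"{obj} and chain {chain}")
        names.append(name)
    return names


class RecordingCmd:
    """Hypothetical stand-in for pymol.cmd that only records calls."""

    def __init__(self):
        self.calls = []

    def create(self, name, selection):
        self.calls.append(("create", name, selection))


recorder = RecordingCmd()
new_objects = split_chains(recorder, "3bow", ["A", "B", "C"])
for call in recorder.calls:
    print(call)
```

Inside a PyMOL session one would pass the real interface instead: `from pymol import cmd`, then `split_chains(cmd, "3bow", ["A", "B", "C"])` after fetching the structure.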

After creating a view that you like, you can save it as a session. Reloading that session into PyMOL will bring you back to where you were.

Measuring distances and angles

There are two main ways to make measurements in PyMOL.

• via the measurement “wizard”. Activating that creates an extra menu near the bottom right corner where you can choose what measurement you wish to do. You are prompted to click on atoms to measure.
• via typing “distance” in the command line after picking two atoms (or “angle” after picking three or “dihedral” after picking four)

After creating a measurement object, you can change its colour via the “C” menu and turn the labels on and off via the “S” and “H” menus (or via the command line).

Displaying H-bonds

• “A” -> find -> polar contacts -> within selection (+ other selection options)
  ◦ involving side chains, solvent, just intra-main chain, etc.

Aligning homologous structures

• Easy example first (B. circulans xylanase versus T. harzianum xylanase):
  ◦ fetch 1xnb 1xnd
  ◦ Note that these two structures are nearly identical, but not completely. How can you find the areas that are most different? Hint: use the align function's “object” option to record the atoms that match, and make use of the fact that the alignment object's name is also a selection. E.g. after aligning 1xnd onto 1xnb (using the “A” -> align -> to molecule menu item) there will be a new object representing the aligned atoms called aln_1xnd_to_1xnb. You can then colour the objects with color grey, not aln_1xnd_to_1xnb and you will see that the similar regions keep their original colours while the more divergent regions are now coloured grey. Note that this alignment selection only applies to the C-alpha atoms.
• We can save the alignment object as a clustal format alignment file by typing save 1xnd-1xnb.aln, aln_1xnd_to_1xnb
  ◦ Note that only the atoms used in the alignment are “aligned” in the clustal file

• One can also calculate an RMS distance between the structures and colour them according to that using my rmsd_b.py script.

A script is a Python program that extends the functionality of PyMOL by defining a new function that one can run within PyMOL (or occasionally by running a set of commands directly). In order to load such a function into PyMOL, the script file must first be run. Either find the script via Run... in the File menu or type: run rmsd_b.py

Then we can calculate the RMSD for each pair of aligned Cα atoms:

rmsd_b 1xnb & aln_1xnd_to_1xnb, 1xnd & aln_1xnd_to_1xnb

Note that this rmsd_b function requires that the two selections have the same number of atoms, so I include the alignment object name in the selection in order to make sure this is the case.
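To see why the two selections must contain the same number of atoms, consider the following minimal sketch of a pairwise deviation calculation. It is not the rmsd_b.py script itself: the function name `pairwise_deviation` and the coordinates are made up for illustration, and the structures are assumed to be superimposed already. Atoms are paired by position in the two lists, so a mismatch in length leaves some atoms without a partner.

```python
import math


def pairwise_deviation(coords_a, coords_b):
    """Return per-pair distances and the overall RMSD for matched atoms."""
    if len(coords_a) != len(coords_b):
        raise ValueError("selections must contain the same number of atoms")
    # Atom i of the first list is compared with atom i of the second list.
    distances = [math.dist(a, b) for a, b in zip(coords_a, coords_b)]
    rmsd = math.sqrt(sum(d * d for d in distances) / len(distances))
    return distances, rmsd


# Made-up C-alpha coordinates (in Å) for three aligned residue pairs.
structure_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
structure_2 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 3.0, 4.0)]

distances, rmsd = pairwise_deviation(structure_1, structure_2)
print(distances, round(rmsd, 3))
```

The per-pair distances show where the structures diverge, which the single overall RMSD value hides.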

After running rmsd_b we can colour the objects via their B-factor values using the spectrum option under the C menu or by running my color_b.py script:

color_b 1xnb & aln_1xnd_to_1xnb
color_b 1xnd & aln_1xnd_to_1xnb

Notice that the B-factor column in those structures was not changed for the regions that are not part of the aligned regions so we have to be careful how we do the colouring. Note also that the

Aligning structures (a more difficult example)

• More difficult example: 1pop (papain) versus 2io8.pdb (E. coli glutathionylspermidine amidase)
  ◦ In these structures, Cys25, His159 and Asn175 (1pop) and Cys59, His131 and Glu147 (2io8) form the catalytic triad
  ◦ See: PMID: 21226054
  ◦ We cannot simply use the align command on the whole structure here. In order to do this alignment, we'll identify the active site triad. For papain, start with Cys25 and look for the His159 and Asn175 residues nearby. Left click on those three residues to select them. This should create a new selection called “sele”. Rename that selection to “1pop_triad”. The abstract for the paper describing the E. coli glutathionylspermidine amidase structure mentions the catalytic residues, so let's align papain's to those:
  ◦ align 1pop & 1pop_triad, 2io8 & chain A and resi 59+131+147, object=aln
  ◦ Note that the Asn and Glu residues are not actually used in the alignment.
  ◦ One could use the pair_fit command here to specify exactly which atoms to match

Map out ligand interaction site

• examine contacts between inhibitor and enzyme
• fetch 1tl9
• use sequence tool to find inhibitor and select it
  ◦ Change name of inhibitor selection from sele to inhib
  ◦ Show inhibitor as sticks
  ◦ Show enzyme as surface; note the need to create a separate object for the enzyme in order to display just the surface of the enzyme.
    ▪ Not necessary if inhibitor is listed as a HETATM entry in the PDB file.

• map ligand-protein contacts
• for inhib selection, find polar contacts to other atoms, not including solvent
• find non-polar contacts between inhib and protein (easiest by using the distance command along with selections that include only the carbon atoms)
  ◦ note that one may have to increase the cutoff distance to see many non-polar contacts

• use the iterate command to obtain a list of contacts:
  ◦ iterate 1tl9 & chain A & polymer & elem C within 4 of 1tl9 & chain B & elem C, print(resi, resn, name)

or if you prefer doing that in steps (including displaying the contacts):
    ▪ select 1tl9_a, 1tl9 & c. a & polymer & e. c
    ▪ distance hydrophobic_contacts, 1tl9_a, inhib & e. c, 4
    ▪ select 1tl9_a_contacts, 1tl9_a within 4 of inhib & e. c
    ▪ iterate 1tl9_a_contacts, print(resi, resn, name)
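The “within 4 of” selections above keep every atom of the first set that lies within 4 Å of at least one atom of the second set. A minimal Python sketch of that test follows; the function name `atoms_within`, the atom labels and the coordinates are all made up for illustration, and treating the cutoff as inclusive is an assumption.

```python
import math


def atoms_within(candidates, reference, cutoff=4.0):
    """Return labels of candidate atoms within `cutoff` Å of any reference atom."""
    hits = []
    for label, xyz in candidates:
        # One close reference atom is enough for the candidate to be kept.
        if any(math.dist(xyz, ref_xyz) <= cutoff for _, ref_xyz in reference):
            hits.append(label)
    return hits


# Made-up carbon atoms: (resi, resn, name) labels with x, y, z in Å.
protein_carbons = [
    (("25", "CYS", "CB"), (0.0, 0.0, 0.0)),
    (("26", "TRP", "CA"), (10.0, 0.0, 0.0)),
    (("69", "GLY", "CA"), (3.0, 3.5, 0.0)),
]
inhibitor_carbons = [
    (("1", "LIG", "C1"), (3.0, 0.0, 0.0)),
]

contacts = atoms_within(protein_carbons, inhibitor_carbons, cutoff=4.0)
for resi, resn, name in contacts:
    print(resi, resn, name)
```

Raising `cutoff` lengthens the list, which mirrors the earlier note about increasing the cutoff distance to see more non-polar contacts.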

Calculate surface areas

• definition of molecular and solvent-accessible surfaces
  ◦ probe radius in comparison to water radius
• use get_area:
  ◦ e.g. get_area 1tl9 &! inhib &! r. hoh within 4 of inhib
  ◦ show surface 1tl9 &! inhib &! r. hoh within 4 of inhib
  ◦ set dot_solvent, 1 (to force calculation of only solvent-accessible areas)

Mutagenesis wizard

We can alter the structure in PyMOL. Switch mouse mode to “editing” mode by clicking in the bottom-right region of the graphics window where the mouse functions are described. For example, to change the torsion angle of a bond, hold down the Ctrl key and right click and drag on a bond. Note that you have to pick the correct end of the bond to achieve your desired result.

We can use PyMOL to make mutations. For example, we could build a disulfide in a protein that did not have one before.

• Load 1xnb
• use the “Disulfide by Design” website (http://cptweb.cpt.wayne.edu/DbD2/index.php) to analyze the structure for possible disulfide sites
• enter the 1xnb PDB code and run the program
• Use PyMOL's mutagenesis “wizard” tool to mutate the pairs of residues to CYS
• Alter CYS side-chain χ (chi) angles as necessary to bring S atoms within about 2 Å of each other.
• Pick the two S atoms and type “bond” to form a bond between them

A protein/DNA complex (fetch 1akh)

This is a structure of a DNA fragment with two proteins from yeast. Chain A is the mating-type A-1 protein and chain B is the mating-type alpha-2 protein.

• try PyMOL's simple electrostatic display (note warning in console about missing atoms)
• try my color_by_attype.py script, which can be downloaded from my web site
  ◦ run color_by_attype.py
  ◦ color_by_attype 1akh
• Use the distance command to look at the interactions between the two protein molecules, e.g.
  ◦ distance cha-chb, chain A, chain B, 3.5
• Now limit that to strictly hydrophobic interactions
  ◦ distance phobic, chain A & not elem o+n, chain B & not elem o+n, 3.5
• What about the protein-DNA contacts: are they hydrophobic or polar?
  ◦ distance chAB-DNA, chain A+B & not elem o+n, chain C+D & not elem o+n, 3.5

In order to best visualize the types of interactions, it is often best to separate the structure into the component molecules in order to display one of them in surface mode, while the other(s) are shown in the cartoon or stick representation. To do this, select a portion of the structure and use the copy_to_object item under the “A” menu of the selection or use the create command:

• create chnA, chain A
• create chnB, chain B
• create DNA, chain C+D

Building small molecules and sculpting

Click on the “Builder” button in the external GUI. One can choose between building peptides or other chemical structures by clicking on the “Protein” or “Chemical” buttons on the left side.

Note that for building and minimizing the energy of small molecules, the program Avogadro (http://avogadro.openmolecules.net/) is better than PyMOL.

Use of symmetry to generate biological molecule

• fetch 2pc0 (a structure of the HIV protease)
• note that it is missing half the structure
• generate -> symmetry mates -> within 4 Å
• note that several surrounding molecules are now present and we need to figure out which is the correct one
  ◦ take note of the contacts between them
  ◦ re-color the original structure red and turn on and off the symmetry related objects until you find the symmetry object that makes the most contacts (should be 2pc0_05000000)

NMR structures (or other multi-state objects such as from modelling)

• fetch 1e88
• play as movie
  ◦ note that one part of the structure (central domain) moves around a great deal while the N-terminal and C-terminal domains move much less.
• show all structures with: set all_states, 1
• show one structure with: set all_states, 0
• align (intra_fit) N and C-terminal domains (use sequence display to select the domains to align)
• align (intra_fit) central domain
• download my rmsf_states.py script
• run rmsf_states.py
• display cartoon as putty by typing: cartoon putty, 1e88
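For readers unfamiliar with the quantity, a root-mean-square fluctuation (RMSF) measures how far each atom strays from its average position across the states. The sketch below is a plain-Python illustration under stated assumptions, not the rmsf_states.py script: the function name `rmsf_per_atom` and the coordinates are made up, and the states are assumed to have been superimposed already (for example with intra_fit).

```python
import math


def rmsf_per_atom(states):
    """Compute one RMSF value per atom from a list of superimposed states.

    `states` is a list of models; each model is a list of (x, y, z) tuples
    with the atoms in the same order.
    """
    n_states = len(states)
    n_atoms = len(states[0])
    if any(len(state) != n_atoms for state in states):
        raise ValueError("every state must contain the same atoms")
    result = []
    for i in range(n_atoms):
        positions = [state[i] for state in states]
        # Average position of atom i over all states.
        mean = tuple(sum(p[k] for p in positions) / n_states for k in range(3))
        # Mean squared distance from that average, then the square root.
        msf = sum(math.dist(p, mean) ** 2 for p in positions) / n_states
        result.append(math.sqrt(msf))
    return result


# Two made-up states of a two-atom fragment: atom 0 is fixed, atom 1 moves 2 Å.
state_1 = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
state_2 = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]

fluctuations = rmsf_per_atom([state_1, state_2])
print(fluctuations)
```

Which atoms were used for the superposition matters: fitting on the N- and C-terminal domains versus the central domain will give different fluctuation profiles.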