(Revised 8/27/08) Introduction to Molecular Modeling Lab 2 - Modeling Input files

Table of Contents

Objectives

Setting up
  Remote Access to Amber, Frodo, and Iris
    From a PC
    From a Unix Workstation

Obtaining Protein and Oligonucleotide Structures
  Directory setup
  Mozilla
    PC
    Unix Machine
  Protein crystal structure files
  Oligonucleotide crystal structure files
  FTP file transfer
    Unix to Unix
    PC to Unix
  Structure data & searching (coordinate files)

Viewing .pdb files
  Protein .pdb file structure
  Resources for processing files - File format conversion
  Viewing with rasmol
    Unix Machine (e.g. Amber, Iris, or Frodo)
    On a PC
  Viewing .pdb files with Moilview
  Other programs for visualizing structure files
    Image Databases
    Processed Image viewers and Editors
    Basic Model Visualization Tools and Plug-ins
    OnLine Structure Visualization (Java/VRML-compliant Web Browser)

xleap - Setting up for Molecular Mechanics and Molecular Dynamics
  The Universe Editor
  The Unit Editor
  The Parameter Editing Table
  Viewing the protein and oligonucleotide structures

Printing Structures
  Printing from a PC
    Rasmol
    Other PC programs
  Printing in the Molecular Modeling lab
    Rasmol
    Moilview

Exercises

Objectives

The objectives of this lab are to learn

1) how to locate the structures of proteins or oligonucleotides from web sources for molecular modeling,
2) how to download a structure to the workstation,
3) how to modify the file if necessary,
4) how to display the structure, and modify it if desired, and
5) how to print your pictures.

This laboratory will require you to use the skills you learned in the first lab (Unix tutorial). If you did not complete that tutorial, you may have some difficulty with this laboratory. It would be a good idea to complete the tutorial. At a minimum, you should have the Unix tutorial lab handout as you work through this lab.

The flow of this lab will be as follows:

1) Set up your user account so that the required programs will be available to you.
2) Use a web browser (Mozilla or Netscape) to browse protein and nucleic acid databases for desired structures.
3) Retrieve a protein and an oligonucleotide structure.
4) Run viewing programs to inspect the structures retrieved.
5) Print pictures (files).

Setting up

Remote Access to Amber, Frodo, and Iris

From a PC:

General instructions for remote access to the lab computers were given in lab 1 and should be followed here. There are some additional considerations that need to be pointed out. Some of the hardware and software you will find useful is listed here. Some of the software is freely available and some will have to be purchased, depending upon what you want to do and how reasonable it might be to come to the lab.

1) Hardware. Assuming you have a PC and an internet connection, there are only two pieces of hardware you might need.

First, a three button mouse is almost a must. Amber (specifically xleap), Sybyl and InsightII use a three button mouse for manipulating structures. If you do not have a three button mouse, your ability to manipulate structures on the screen will be severely limited and it is strongly advised that you get one. Scrolling mice (the kind with the wheel/button) usually work, though sometimes they behave oddly. Mice with three true buttons are slightly more convenient to use here.

Second, while more of a luxury than a requirement, a color printer is very useful for printing structures. A B/W printer is sufficient for simple structures, but for more complicated ones, color really helps.

2) PC Software: A few programs that will help you are rasmol (and/or Weblabviewer Lite), a good FTP program (I recommend WS_FTP_LE for non-secure FTP and WINSCP for secure shell FTP), and a plotting program. I use either Sigma-Plot or Psi-plot, but any plotting program that can import ASCII files will work. For the PC savvy, you can download GRACE. The equivalent of this program, xmgr, is used on the Unix machines, and Grace is used on the Linux machines in the modeling lab. (Note that xmgr and Grace are really the same program by different names.) rasmol (or better; PC, Mac, Linux, and Unix versions are available) and Weblabviewer Lite can be downloaded at no cost. The websites are given below. Likewise, WS_FTP95 can be downloaded at no cost. If you find a good plotting program for a PC that is free, let me know.

rasmol http://www.umass.edu/microbio/rasmol/getras.htm
Pymol http://pymol.sourceforge.net/
Viewer Comparison http://pymol.sourceforge.net/pmimag/compare.html (this site compares Pymol, Rasmol, MolMol, VMD, Chimera, DeepView, and Python, gOpenMol, Qmol, Biodesigner)
WS_FTP9_LE http://www.ftpplanet.com/download.asp
WINSCP http://winscp.net/eng/download.php

Some of the machines in the modeling lab require your computer to be running secure-shell. Below are sites from which secure-shell software can be obtained at no cost.

PuTTY http://www.chiark.greenend.org.uk/~sgtatham/putty/
OpenSSH http://www.openssh.com/portable.html#http (This site has a list of mirror sites from which OpenSSH can be downloaded.)

All of these programs are self-executing and will set themselves up when you run them.

From a Unix Workstation

1) Log in to your account on any of the three SGI computers and edit your .cshrc file. Add the following line to the end of this file using the text editor of your choice (e.g. nedit, jot, vi). You must do this on each machine you have an account on; otherwise many of the programs will not work as described in this and subsequent labs. Basically, this will set up your user environment so that you have access to the programs you need on each machine. It will also set up some aliases to make it easier to telnet and ftp between machines.

source /usr/local/amberr

or

source /disk02/usr/local/amberr

depending upon which machine you are on. After you have edited your .cshrc file, enter the command shown below.

source .cshrc

This is the only time you will have to enter this command, as it will be done for you each time you log in.

Obtaining Protein and Oligonucleotide Structures

The input files you might want or need for many proteins and oligonucleotides can be obtained from databases you can access via the internet. It is not the purpose of this lab to explore all of these databases. Instead, one database for proteins and one for nucleic acids will be used to introduce you to what is available and to obtain a protein and an oligonucleotide structure. At the end of this section are web addresses for databases containing protein and oligonucleotide crystallographic information in addition to those used here. Another source is the links page on the Computational Chemistry and Molecular Modeling Lab’s homepage (http://www.hsc.wvu.edu/sop/compchem/links.htm).

Directory setup

Prior to starting any project, it is good practice to set up a directory or directories to do your work in and to keep all the relevant files together. For this lab you will need to set up two directories.

1) Create the following subdirectories to put your work in. Type the following two commands:

mkdir ~/protein_data
mkdir ~/dna_data

(~/ refers to your home directory, where you are when you log in. If you are in your home directory, simply entering mkdir protein_data will work just as well.)

2) Change to the ~/protein_data directory:

cd ~/protein_data

Mozilla

PC

If you are running from a PC, you must use your own browser to download the files and then FTP them to the Unix machines or SFTP them to the Linux machines. (Here, Amber is being used as an example. If you are on one of the other machines, substitute its machine name for Amber.) In that case, follow the directions given below (Protein crystal structure files) and then, when you have retrieved the files you need, FTP them to Amber (see below).

Unix Machine

The SGI browser is Mozilla (in one or two cases it may be Netscape, in which case type netscape wherever the instructions below say to type mozilla; we are updating as fast as we can). Internet Explorer is not available, so get used to it. Before you begin, find out where you are. First, you should be logged in to Amber (or Iris or Frodo, etc.). Second, you should be in your ~/protein_data sub-directory. If you aren’t, change to this directory now. You will be running the browser through Amber. Thus, when you retrieve files they will be stored on Amber.

At the Amber% prompt, type

mozilla

After a pause, the browser should start.

Protein crystal structure files

In the field titled ‘Location’ enter the following address:

http://www.rcsb.org/pdb/

This will take you to the home page of the Research Collaboratory for Structural Bioinformatics. Click on advanced search. Click on Query Type and pick molecule name. Enter insulin and then click Evaluate Query. You’ll get over 200 hits. Click on advanced search again, click on Query Type, select PDB ID, enter 4INS, and then click Evaluate Subquery. You should get one hit. Click on it. Then click on the download icon and save the file.

You should already be in the correct place (~/protein_data). After you download the file, check that it is in the right place. Also, run either rasmol or pymol and load the PDB file to make sure it is OK (you should see a molecule; if you get a blank screen, try again). If the file is in the wrong directory, move it to ~/protein_data. The file 4INS.pdb should now be in your ~/protein_data sub-directory.
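The check described above can also be done without a viewer. The following is a minimal sketch in Python, assuming that a usable download contains at least one ATOM or HETATM record. The function names are hypothetical and are not part of rasmol, pymol, or Amber.

```python
def count_coordinate_records(pdb_text):
    """Count the ATOM and HETATM records in the text of a PDB file."""
    count = 0
    for line in pdb_text.splitlines():
        # The record name occupies the first six columns of each line.
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            count += 1
    return count


def looks_like_pdb(pdb_text):
    """Return True if the text contains at least one coordinate record."""
    return count_coordinate_records(pdb_text) > 0
```

To use it on the downloaded file, read the file first, for example with open("4INS.pdb").read(), and pass the text to looks_like_pdb. A saved error page or an empty file gives False.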

Once you have obtained the 4INS.pdb file, you may want to explore the site some. Here, you have downloaded a file for a relatively small protein - insulin. You may be interested in other proteins and you might try to search for them on this site. Feel free to download any that interest you (but don’t try to download all 8000-9000 crystal structures that reside in the database).

Oligonucleotide crystal structure files

The procedure to get oligonucleotides is roughly the same as for proteins. Follow the directions up to the web address. Instead of the protein address, use:

http://ndbserver.rutgers.edu

Then, in succession, select the following links: ‘Search’, ‘NDB Search’. On the page you will see two sections (General... and Experimental). Under the Experimental section, use the radio buttons or pull down menus to select the following search criteria: Classification DNA, Structure description Double Helix, Conformation type B. Then click on the ‘Search’ button. You will get a bunch of structures back - about 1600. At the bottom of the page change the number of structures to be displayed per page to 1600 and press enter. Scroll down until you find BD0004. This is the Dickerson Dodecamer. Click on the BD0004 link. Scroll down and click on ‘Biological Unit coordinates (pdb format)’. After the file downloads, click on ‘File’ on the tool bar, then ‘Save as’. Rename the structure to BD0004.pdb and then save it to your ~/dna_data subdirectory. Once you have retrieved this structure, you may want to investigate this site. Feel free to download any structures that interest you (e.g., I’d download a triplex and a quad DNA structure). Again, please don’t try to obtain the whole database.

FTP file transfer

FTP stands for ‘File Transfer Protocol’ and is a convenient way to move files, especially large ones, from computer to computer. In the case of the Unix machines, there are no floppy disks and files are transferred to them from (1) CDROM, (2) DAT (digital audio tape) or (3) over the internet (e.g. FTP). There are two situations you will likely encounter. The first is where you want to move a file from the machine you are on to a remote machine, and the second is where the file you want on your local machine is on a remote machine. Examples are provided below in which a file is transferred from a remote machine to your local machine and the other way around.

Unix to Unix

For the purpose of these examples, assume the two machines are Iris (the local machine, where you are logged in) and Amber (the remote machine).

To move a file from Iris to Amber:

1) Log in and locate the file you want to move. It is simpler if you change to the directory the file is located in. Otherwise, you will have to remember the full path to the file.

2) FTP to Amber and log in to your account. Do this by executing the command below. Amber will respond by requesting your login and password. Respond accordingly and change your directory to the location where you want the transferred file placed. Again, it is simpler this way, but you can specify the full path to this directory, too.

ftp amber

3) Transfer the file by typing the command shown below. The first command assumes on both local and remote machines you are in the correct sub-directories. The second assumes that you are not (still will work), that the file on the local machine is in /usr/people/turkey/data, that you want to copy the file to /usr/people/turkey/data_also, and that the file is called a_file.

put a_file

put /usr/people/turkey/data/a_file \
/usr/people/turkey/data_also/a_file

(Note: the symbol \ means that the command line is continued on the next line. So, the second put command should be typed all on one line. If you do type ‘\’, press enter, type the next line, and press enter again.)

4) To transfer the file from Amber to Iris, follow steps 1 through 3 but replace the put command with one of the following commands:

get a_file

get /usr/people/turkey/data/a_file \
/usr/people/turkey/data_also/a_file

If you want to transfer multiple files you can use mput or mget (these behave like put and get, respectively). The files to get or put can be listed individually on the command line, or wild cards can be used. For example, to get all the files with the extension .pdb in a directory, the command shown below would be used:

mget *.pdb

To send (put) all the files with the extension .pdb in a directory, the command shown below would be used:

mput *.pdb
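To preview which files a wild card such as *.pdb would select before running mget or mput, the pattern can be tried on a list of names. This is only a sketch using Python's standard fnmatch module. The file names are made up for illustration, and the matching done by the ftp client or the remote server may differ in detail.

```python
import fnmatch


def select_files(names, pattern):
    """Return the names matching a shell-style wild card, in sorted order."""
    # fnmatchcase keeps the match case sensitive, as file names are on Unix.
    return sorted(n for n in names if fnmatch.fnmatchcase(n, pattern))


# Hypothetical directory listing.
listing = ["4INS.pdb", "BD0004.pdb", "notes.txt", "4INS.PDB"]
print(select_files(listing, "*.pdb"))
```

Note that 4INS.PDB is not selected, because the pattern is matched with case taken into account.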

Linux machines

Most of the commands listed above will work just fine Linux-to-Unix or Unix-to-Linux. However, when transferring files from a Unix machine to a Linux box in the lab, you have to use sftp or scp instead of plain FTP. Alternatively, the program gftp is listed under Applications/internet. Once started it is fairly straightforward to use. Likewise, from a PC to a Linux machine, you have to use a secure shell based protocol. The program WINSCP will work for this purpose.

PC to Unix

File transfers to and from a PC must be done with the PC as the local machine (unless you’re running WinNT or Linux). A very simple way to FTP files to and from a PC is as follows.

1) Open a DOS window (usually, Start/Program/MS-DOS). Change to the directory where you want files or where the files you want sent are.
2) Enter the command

ftp amber.hsc.wvu.edu

3) After you have connected, you will be asked for your login and password. Enter these and you will be in your home directory. Change directories to the one where the files are that you want, or to where you will move the files to. Files can be transferred from the Unix machine to your PC with:

get afile

Note that if you are transferring files other than text files you will need to first type ‘bin’ and then issue the get command. If you are unsure about the file type, type ‘bin’ first as either text or binary files can be transferred, without problems, after typing ‘bin.’

To send a file (called myfile) from your PC to a Unix machine (here assumed to be Amber), type the following command:

put myfile

If you are uncomfortable working in a DOS environment or find yourself transferring many files, it would be best to get a Windows-based FTP program (e.g. WS_FTP95 or WS_FTP_LE; the best choice is WINSCP). The PCs in the modeling lab (CCMM) (yoda, the machine that shares a monitor, mouse and keyboard with kali, and also hal) have WINSCP installed on them. To use this program, double-click on the WinSCP icon to fire it up. Pick the machine you want to FTP to or, if it is not on the list, click new and enter the name of the machine (e.g. opal is opal.hsc.wvu.edu). In the latter case you will have to enter your username (on the remote machine). Save this (accept the default). It will now appear on the list. Click on it and then on Login. The rest is pretty self-explanatory.

Structure data & searching (coordinate files)

Listed below are a number of sites that may be of assistance in finding structure files. The links were valid as of 8/13/05 but these things do come and go. So, if a link is dead, report it to me and go on to the next one.

http://gibk26.bse.kyutech.ac.jp/jouhou/3dinsight/3DinSight.html 3D-In-Sight
http://www.ebi.ac.uk/thornton-srv/databases/pdbsum/ PDBsum
http://molbio.info.nih.gov/cgi-bin/pdb Molecules-R-US
http://www.ccdc.cam.ac.uk/products/csd Cambridge Structural Database
http://ndbserver.rutgers.edu Nucleic Acids Database

Viewing .pdb files

Files obtained from internet databases sometimes (read usually) need modification. This may be due to errors in the files themselves or because you are using them as a starting point for a related structure. To be able to modify .pdb files you must first have an understanding of the file structure as discussed briefly below. Additional resources for processing files are given at the end of this section.

Protein .pdb file structure

The protein .pdb file has several sections. The first section of the 4INS.pdb file is shown below.

HEADER HORMONE 10-JUL-89 4INS 4INSA 1
COMPND INSULIN 4INS 4
SOURCE PIG (SUS $SCROFA) 4INS 5
AUTHOR G.G.DODSON,E.J.DODSON,D..HODGKIN,N.W.ISAACS,M.VIJAYAN 4INS 6
REVDAT 3 31-JUL-94 4INSB 3 HETATM 4INSB 1
REVDAT 2 15-JUL-93 4INSA 1 HEADER 4INSA 2
REVDAT 1 15-APR-90 4INS 0 4INS 7
SPRSDE 15-APR-90 4INS 1INS 4INS 8
REMARK 1 4INS 9
REMARK 1 REFERENCE 1 4INS 10
REMARK 1 AUTH E.N.BAKER,T.L.BLUNDELL,J.F.CUTFIELD,S.M.CUTFIELD, 4INS 11
REMARK 1 AUTH 2 E.J.DODSON,G.G.DODSON,D.M.CROWFOOT HODGKIN, 4INS 12
REMARK 1 AUTH 3 R.E.HUBBARD,N.W.ISAACS,C.D.REYNOLDS,K.SAKABE, 4INS 13
REMARK 1 AUTH 4 N.SAKABE,N.M.VIJAYAN 4INS 14
REMARK 1 TITL THE STRUCTURE OF 2ZN PIG INSULIN CRYSTALS AT 1.5 4INS 15
REMARK 1 TITL 2 ANGSTROMS RESOLUTION 4INS 16
REMARK 1 REF PHILOS.TRANS.R.SOC.LONDON, V. 319 369 1988 4INS 17
REMARK 1 REF 2 SER.B 4INS 18
REMARK 1 REFN ASTM PTRBAE UK ISSN 0080-4622 441 4INS 19
......

This part of the file goes on for a while with additional reference information. This information is mainly for the user and is not needed by Amber. The next part of the file contains the amino acid sequence information and is shown below.

SEQRES 1 A 21 GLY ILE VAL GLU GLN CYS CYS THR SER ILE CYS SER LEU 4INS 170
SEQRES 2 A 21 TYR GLN LEU GLU ASN TYR CYS ASN 4INS 171
SEQRES 1 B 30 PHE VAL ASN GLN HIS LEU CYS GLY SER HIS LEU VAL GLU 4INS 172
SEQRES 2 B 30 ALA LEU TYR LEU VAL CYS GLY GLU ARG GLY PHE PHE TYR 4INS 173
SEQRES 3 B 30 THR PRO LYS ALA 4INS 174
SEQRES 1 C 21 GLY ILE VAL GLU GLN CYS CYS THR SER ILE CYS SER LEU 4INS 175
SEQRES 2 C 21 TYR GLN LEU GLU ASN TYR CYS ASN 4INS 176
SEQRES 1 D 30 PHE VAL ASN GLN HIS LEU CYS GLY SER HIS LEU VAL GLU 4INS 177
SEQRES 2 D 30 ALA LEU TYR LEU VAL CYS GLY GLU ARG GLY PHE PHE TYR 4INS 178
SEQRES 3 D 30 THR PRO LYS ALA 4INS 179

This information is needed by Amber. In some cases, the sequence information must be modified. For example, it may be necessary to change certain CYS residues to CYX. In the present case, this is necessary since the A and B chains are connected through S-S bonds (indicated by the CYX abbreviation). The locations of the S-S bonds are given in the SSBOND records, which begin on line 207 of the file.

SSBOND 1 CYS A 6 CYS A 11 4INS 207
SSBOND 2 CYS C 6 CYS C 11 4INS 208
SSBOND 3 CYS A 7 CYS B 7 4INS 209
SSBOND 4 CYS A 20 CYS B 19 4INS 210
SSBOND 5 CYS C 7 CYS D 7 4INS 211
SSBOND 6 CYS C 20 CYS D 19 4INS 212

Using this information, lines 170-179 have been modified (CYS->CYX) as shown below.

SEQRES 1 A 21 GLY ILE VAL GLU GLN CYX CYX THR SER ILE CYX SER LEU 4INS 170
SEQRES 2 A 21 TYR GLN LEU GLU ASN TYR CYX ASN 4INS 171
SEQRES 1 B 30 PHE VAL ASN GLN HIS LEU CYX GLY SER HIS LEU VAL GLU 4INS 172
SEQRES 2 B 30 ALA LEU TYR LEU VAL CYX GLY GLU ARG GLY PHE PHE TYR 4INS 173
SEQRES 3 B 30 THR PRO LYS ALA 4INS 174
SEQRES 1 C 21 GLY ILE VAL GLU GLN CYX CYX THR SER ILE CYX SER LEU 4INS 175
SEQRES 2 C 21 TYR GLN LEU GLU ASN TYR CYX ASN 4INS 176
SEQRES 1 D 30 PHE VAL ASN GLN HIS LEU CYX GLY SER HIS LEU VAL GLU 4INS 177
SEQRES 2 D 30 ALA LEU TYR LEU VAL CYX GLY GLU ARG GLY PHE PHE TYR 4INS 178
SEQRES 3 D 30 THR PRO LYS ALA 4INS 179
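The CYS->CYX bookkeeping can be sketched in Python as follows. This is an illustration only, not an Amber tool: the function names are hypothetical, the records are assumed to be whitespace separated as printed in this handout, and each chain is assumed to be numbered consecutively starting at residue 1 (true for the chains shown here, but not for every PDB entry).

```python
def disulfide_positions(ssbond_lines):
    """Collect (chain, residue number) for every cysteine named in SSBOND records."""
    positions = set()
    for line in ssbond_lines:
        fields = line.split()
        if not fields or fields[0] != "SSBOND":
            continue
        # Fields: SSBOND serial CYS chain1 number1 CYS chain2 number2 ...
        positions.add((fields[3], int(fields[4])))
        positions.add((fields[6], int(fields[7])))
    return positions


def mark_cyx(sequence, chain, positions):
    """Return a copy of one chain's sequence with bonded CYS renamed to CYX."""
    renamed = []
    # Assumption: the first residue of the chain is residue number 1.
    for number, residue in enumerate(sequence, start=1):
        if residue == "CYS" and (chain, number) in positions:
            renamed.append("CYX")
        else:
            renamed.append(residue)
    return renamed


ssbond = [
    "SSBOND 1 CYS A 6 CYS A 11",
    "SSBOND 2 CYS C 6 CYS C 11",
    "SSBOND 3 CYS A 7 CYS B 7",
    "SSBOND 4 CYS A 20 CYS B 19",
    "SSBOND 5 CYS C 7 CYS D 7",
    "SSBOND 6 CYS C 20 CYS D 19",
]
chain_a = ("GLY ILE VAL GLU GLN CYS CYS THR SER ILE CYS SER LEU "
           "TYR GLN LEU GLU ASN TYR CYS ASN").split()
print(" ".join(mark_cyx(chain_a, "A", disulfide_positions(ssbond))))
```

For chain A this renames the cysteines at positions 6, 7, 11 and 20, which agrees with the modified SEQRES lines for chain A shown above.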

In addition, the lines in the file that give the coordinates of the CYS residues have to be modified to read CYX. These lines are: 272-283, 305-310, 383-388, 456-461, 547-552, 692-703, 725-730, 803-808, 872-877, and 962-967. Finding these lines is not difficult. Open the file in jot and use the search/replace option to make the corrections. Finally, the connections between the CYS residues also have to be defined at the end of the file in the statements beginning with the key word CONECT (yes, conect, not connect). Some of the coordinate lines are shown below.

ATOM 1 N GLY A 1 -8.863 16.944 14.289 1.00 21.88 1 4INS 235
ATOM 2 CA GLY A 1 -9.929 17.026 13.244 1.00 22.85 1 4INS 236
ATOM 3 C GLY A 1 -10.051 15.625 12.618 1.00 43.92 1 4INS 237
ATOM 4 O GLY A 1 -9.782 14.728 13.407 1.00 25.22 1 4INS 238
ATOM 5 N ILE A 2 -10.333 15.531 11.332 1.00 26.28 1 4INS 239
ATOM 6 CA ILE A 2 -10.488 14.266 10.600 1.00 20.84 1 4INS 240
ATOM 7 C ILE A 2 -9.367 13.302 10.658 1.00 11.81 1 4INS 241
ATOM 8 O ILE A 2 -9.580 12.092 10.969 1.00 20.31 1 4INS 242
ATOM 9 CB ILE A 2 -10.883 14.493 9.095 1.00 40.00 1 4INS 243
ATOM 10 CG1 ILE A 2 -11.579 13.146 8.697 1.00 36.74 1 4INS 244
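As an alternative to search/replace in an editor, the renaming of CYS to CYX in the coordinate records could be scripted. The sketch below is hypothetical and is not part of Amber. It assumes the file still has the standard fixed-column PDB layout (residue name in columns 18-20, chain in column 22, residue number in columns 23-26), which the lines printed in this handout do not preserve, and it takes the set of bonded residues from the SSBOND records shown earlier.

```python
# (chain, residue number) pairs taken from the SSBOND records of 4INS.
BONDED = {
    ("A", 6), ("A", 11), ("A", 7), ("B", 7), ("A", 20), ("B", 19),
    ("C", 6), ("C", 11), ("C", 7), ("D", 7), ("C", 20), ("D", 19),
}


def rename_bonded_cys(lines, positions=BONDED):
    """Rename CYS to CYX in ATOM records whose residue is in positions."""
    out = []
    for line in lines:
        # Only ATOM records long enough to hold the residue number are examined.
        if line.startswith("ATOM") and len(line) >= 26 and line[17:20] == "CYS":
            chain = line[21]               # column 22
            number = int(line[22:26])      # columns 23-26
            if (chain, number) in positions:
                # Same width, so every other column stays where it was.
                line = line[:17] + "CYX" + line[20:]
        out.append(line)
    return out
```

Because CYS and CYX have the same length, the rest of each record keeps its column positions. A cysteine that is not listed in an SSBOND record is left as CYS.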

The CONECT lines are as follows:

CONECT 43 42 76 4INS1420
CONECT 49 48 227 4INS1421
CONECT 76 43 75 4INS1422
CONECT 154 153 318 4INS1423
CONECT 227 49 226 4INS1424
CONECT 318 154 317 4INS1425
CONECT 463 462 496 4INS1426
CONECT 469 468 643 4INS1427
CONECT 496 463 495 4INS1428
CONECT 574 573 733 4INS1429
CONECT 643 469 642 4INS1430
CONECT 733 574 732 4INS1431

The first six lines define the disulfide links in one of the two insulin molecules in the crystallographic unit: 43-76 is an S-S bond within the A chain, and 49-227 and 154-318 are S-S bonds between the A and B chains. Each connection is repeated in the opposite direction (e.g. 76-43, 227-49 and 318-154). The second set of six lines is for the second insulin molecule present in the crystallographic unit.
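The pairing described above (each S-S bond appearing once in each direction) can be checked with a short script. This is a sketch under stated assumptions: the function name is hypothetical, the records are split on whitespace as printed in this handout, and any trailing field that is not a plain number (such as 4INS1420) is ignored.

```python
def reciprocal_pairs(conect_lines):
    """Return sorted atom pairs in which each atom lists the other in a CONECT record."""
    neighbors = {}
    for line in conect_lines:
        fields = line.split()
        if not fields or fields[0] != "CONECT":
            continue
        # Keep only the numeric fields; the first is the atom, the rest its neighbors.
        numbers = [int(f) for f in fields[1:] if f.isdigit()]
        if numbers:
            neighbors.setdefault(numbers[0], set()).update(numbers[1:])
    pairs = set()
    for atom, others in neighbors.items():
        for other in others:
            # A bond is kept only if it is also recorded in the opposite direction.
            if other in neighbors and atom in neighbors[other]:
                pairs.add((min(atom, other), max(atom, other)))
    return sorted(pairs)


conect = [
    "CONECT 43 42 76 4INS1420", "CONECT 49 48 227 4INS1421",
    "CONECT 76 43 75 4INS1422", "CONECT 154 153 318 4INS1423",
    "CONECT 227 49 226 4INS1424", "CONECT 318 154 317 4INS1425",
    "CONECT 463 462 496 4INS1426", "CONECT 469 468 643 4INS1427",
    "CONECT 496 463 495 4INS1428", "CONECT 574 573 733 4INS1429",
    "CONECT 643 469 642 4INS1430", "CONECT 733 574 732 4INS1431",
]
for pair in reciprocal_pairs(conect):
    print(pair)
```

For the twelve records shown, this yields six pairs, one for each SSBOND record. In a file where atom numbers fill their columns completely, splitting on whitespace would not be reliable and the fixed columns would have to be used instead.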

Finally, depending upon your purpose, you may want to remove the water molecules and/or Zn atoms. You may simply delete these from the file. The Zn atoms are located on lines 1068 and 1069. The waters start after that and run through to the CONECT lines.
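Deleting the waters and Zn atoms can also be done with a small filter instead of by hand. The following is a minimal sketch, assuming a standard fixed-column file in which these atoms are HETATM records with residue names HOH (water) and ZN. Both names are assumptions and should be checked against your own file. The function name is hypothetical.

```python
def strip_hetero(lines, residue_names=("HOH", "ZN")):
    """Drop HETATM records whose residue name (columns 18-20) is in residue_names."""
    kept = []
    for line in lines:
        # Short residue names such as ZN are padded with a blank, hence strip().
        if line.startswith("HETATM") and line[17:20].strip() in residue_names:
            continue
        kept.append(line)
    return kept
```

Passing residue_names=("HOH",) removes only the waters and keeps the Zn atoms. Any CONECT records that refer to removed atoms are left untouched by this sketch and would need to be checked separately.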

Other modifications may be necessary and depend on your specific problem. The purpose here is simply to give you some insight into the .pdb file structure and how to modify them.

Resources for processing files - File format conversion:

http://www.ahpcc.unm.edu/~chem/xmol/xmol.html X-Mol
http://freeweb.interware.hu/frenzy/mol2mol/index.html Mol2Mol
http://openbabel.sourceforge.net/ OpenBabel

Viewing with rasmol

rasmol is one of many programs that can be used to view .pdb files. It has other capabilities, too, and users who are interested in finding out what can be done with rasmol should visit the rasmol homepage. Here, we will simply use it to visualize the 4INS.pdb file.

Unix Machine (e.g. Amber, Iris, or Frodo)

At the command prompt type:

rasmol

This will start the rasmol program. Note that if you add the filename of the pdb file you wish to view, it will be loaded. Otherwise click on File and then Open. In the command window (which may be hidden under the rasmol graphics window, so you may need to move that window to see it) type 4INS.pdb (assuming your current directory is where this file is located) and it will open and be shown in the rasmol window. Explore the various options available (under Edit, Display, Colors, and Options). After looking at the 4INS.pdb file, open the BD0004.pdb file and play with the same options.

You may not have noticed that something is missing from the insulin structure. The structure (and the original .pdb file) does not have the hydrogen atoms. These can be added to the file using the program protonate (part of the amber suite of programs). However, in this lab we will use a different program to do this (see below, xleap).

Linux

Use pymol. Simply open a terminal window and enter pymol. Note, the first time you do this you will need to edit your .bashrc file. In a terminal window in your home directory enter gedit, load .bashrc, and add the following line to the end of the file:

source /usr/local/amberrc

Pymol is pretty simple to use: just click File and Open and then locate the pdb file you want to load. Note that this is a very powerful program for viewing and printing structures. I encourage you to have a look at the manual, available on-line.

On a PC

These instructions assume you have downloaded a copy of rasmol and have set it up to run on your machine. They also assume you are using version 2.6.1 of rasmol. If you don’t have a copy of it, visit the rasmol home page now and download it.

Start up rasmol and click on ‘File/Open’ and then select the file you want to view. The file will be opened and, initially, you will get a stick structure. Experiment with the menu bar commands (esp. File, Display, Options, and Export). Note, the main purpose of rasmol is for viewing structures. In addition, the PC version has some rudimentary printing capabilities and can generate bitmap and gif files. In either case (remote PC or Unix) you should be able to generate the picture shown below (shown with the file modified so as to remove the water molecules).

Viewing .pdb files with Moilview

This is a much more powerful program and has many purposes besides just viewing. However, moilview only works if you are running on Amber, Frodo, Opal, or Iris. This program will be considered in detail in a later lab. However, for those interested, this program can be started by typing the command:

moilview

After clicking on the initial ‘advertisement box’ you get the first time you run moilview, right click in the moilview window and then select file and then Read 1st coordinates. Select either the 4INS.pdb or the BD0004.pdb file from moilview’s file browser by clicking on it, make sure the file type is PDB, click OK, and answer yes to the question ‘Use distance?’. To move the molecule, left click and hold in the window; initially you will be in the ‘rotate’ mode. To change to either translate or size, select either T or S from the ‘toolbox’ in the upper right hand corner. If you right click in the window again to get a menu, move to ‘Objects are’ and pick ‘Spheres’, you will get a space fill version. Explore the menus as long as you like. There is a manual in 1128 for this program - you’ll need it. Alternatively, you can visit the moilview homepage.

Other programs for visualizing structure files

Image Databases:

http://us.expasy.org/sw3d/ SWISS-3DIMAGE
http://kinemage.biochem.duke.edu/kinemage/kinemage.php Kinemages

Processed Image viewers and Editors:

http://www.imagemagick.org/ Image Magick
http://www.trilon.com/xv/ XV
http://www.bmsc.washington.edu/raster3d/raster3d.html Raster3D
http://www.chemicalgraphics.com/PovChem/ PovChem
http://www.chemwindow.com/ ChemWeb

Basic Model Visualization Tools and Plug-ins:

http://www.umass.edu/microbio/rasmol RasMol
http://pymol.sourceforge.net/ Pymol
http://www.mdli.com/chemscape/chime/chime.html Chime
http://rsb.info.nih.gov/nih-image/Default.html NIH-Image
http://morita.chem.sunysb.edu/~carlos/moil-view.html Moilview
http://www.ks.uiuc.edu/Research/vmd/ VMD/NAMD

OnLine Structure Visualization (Java/VRML-compliant Web Browser):

http://www.ks.uiuc.edu/Development/jmv/ Java Molecular Viewer
http://www.embl-heidelberg.de/cgi/viewer.pl WebMol
http://cl.sdsc.edu/QuickPDB.html QuickPDB
http://web.inc.bme.hu/~csonka/vrmlchem.html VRML in Chemistry

xleap - Setting up for Molecular Mechanics and Molecular Dynamics

The Universe Editor (adapted from http://amber.scripps.edu/)

Amber uses a program called xleap for graphical display and editing of structures and for setting up files for calculations including molecular mechanics, molecular dynamics, free energy calculations, etc. The purpose in this lab is only to get acquainted with graphical display and editing of molecules using xleap. In particular, you will use xleap to view the two files you downloaded. Note that, at the time of this writing, xleap does not work on the Linux machines.

If you are not already logged in to Amber, log in now. Start xleap by typing the command

xleap

This command will start xleap. (Note that if you are logged in from a PC via a dial-up account, this will take a long time. Once started, things go more quickly, but not much. It is better to learn to use tleap, the text version of xleap, and run all of your work in command mode. You can do everything in tleap that you can do in xleap except for displaying structures. For the latter purpose, FTP structures to your local machine and view them with rasmol. If you still want to just use xleap, be very patient.) You will see a window open called the Universe Editor that looks similar to the one shown below. Some messages will appear which have to do with the loading of some parameter files as well as some information regarding paths. If you type help once the command prompt (>) appears, you will get a list of the commands available to you in xleap.

If you type help followed by one of the commands shown on the list, you will be given an explanation of the command as well as its format (sort of like the man command in Unix).

Some of the commands can be accessed by clicking on File, Edit, Verbosity. The commands for loading and saving (1) off files, (2) pdb, and (3) amber prep files and the source command appear under File. Edit (either a unit or parmset) and impose are available under Edit. Verbosity is just the level of explanation you’ll get - especially regarding errors.

Note, the commands are shown on the help screen as a mix of lower and upper case letters. However, the Universe Editor is not case sensitive, and the lower/upper case letters are used only to divide the command into ‘words’. You can enter addPdbAtomMap or addpdbatommap, etc., and get the same result in each case.

The Unit Editor

Enter the following command:

edit DG

A second window will open - Graphical Unit Editor. You should see deoxyguanosine monophosphate in the Unit Editor. Translation of the structure is accomplished by pressing and holding the right mouse key and moving the mouse (the little hand shaped cursor needs to be in the Unit Editor window). Rotation is done the same way with the center mouse key. Holding both the center and right keys will allow you to zoom in and out.

The Unit button is used to check structures you build - they have to make chemical sense. This pull down menu will also do a crude minimization and can be used to add hydrogens to your structures.

The edit button allows you to select certain parts of a structure for editing/modification. It can also be used to blank certain parts of a molecule so that you can see what is of interest to you. This can be very useful when it comes to molecules that are surrounded by water molecules.

Finally, the Display button is used to turn on/off the names of atoms, the atom number, show the axes, etc.

In the rectangle headed manipulation are various ‘modes’ including select, twist, move, erase, draw. When you enter the unit editor you are in the select mode. To change, for example, to the erase mode, click the diamond before Erase. Note, the cursor changes to a little eraser - cute. Likewise, if you select draw, the cursor changes to a pencil/pen.

The Element box is used in conjunction with draw. If you select carbon and then draw, carbons are added. Highlight oxygen and when you draw, oxygens will be added.

Experiment with the various options and see what they do. If you lose the structure, or get stuck in some weird state and you can’t figure out how to escape, just exit the unit editor (under Unit). Then type edit DG again. This will get you back where you started.

The Parameter Editing Table

This is popped up by "Edit selected atoms", an item in the Selection pulldown of the Unit Editor, above. You will have to have selected either the entire structure, or some subset of atoms for this to work. The parameter editing table is used to modify atom definitions or add new ones to a force field. Using this table will be covered later in the course.

Viewing the protein and oligonucleotide structures

Exit the unit editor. Next, click on the File button, and move the cursor to load a pdb file. A new window will open. Find the file for the insulin molecule (or other protein) by moving to the directory where it is. Note, to go up one directory, click on ../ (just click and hold momentarily, and release). To go down one directory, do the same over the sub-directory you want to move to. Highlight the file you want to load AND in the field titled Variable (at the top of the window) enter a name (e.g. insulin) for what you are loading. This is basically an alias for the molecule and is more a matter of convenience than anything else. Finally, click on accept. You will see a bunch of messages as the file loads. When you get a > again, type edit insulin (or whatever you called the molecule). This will open the Unit Editor window and you should see the insulin along with some water molecules.

Repeat the above steps, this time loading your oligonucleotide.
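
The same load-and-view steps can also be typed as commands rather than picked from the menus. The following is only a sketch, assuming a Bourne-compatible shell (sh) is available on the workstation: it writes a two-line LEaP script that loads a pdb file into a named variable with loadpdb and then opens it with edit. The helper name write_leap_script and the file names insulin.pdb and load_insulin.leap are made up for illustration; substitute your own.

```shell
#!/bin/sh
# Write a two-line LEaP script that loads a PDB file into a named
# variable and then opens that variable in the Unit Editor.
# Usage: write_leap_script VARIABLE PDBFILE OUTFILE
write_leap_script() {
    var=$1
    pdb=$2
    out=$3
    {
        # "VARIABLE = loadpdb FILE" mirrors the Variable field in the dialog.
        printf '%s = loadpdb %s\n' "$var" "$pdb"
        printf 'edit %s\n' "$var"
    } > "$out"
}

# Example (hypothetical file names):
#   write_leap_script insulin insulin.pdb load_insulin.leap
```

Once the script file exists, it should be possible to run it from the xleap prompt with the source command (the same one listed under File), for example source load_insulin.leap.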

Printing Structures

At various points in the process of conducting molecular mechanics and molecular dynamics, you will likely want to print out your work. This section describes how to do so using some basic programs. These programs will not generally give you publication quality pictures, but they will provide useful working or draft hardcopies.

Printing from a PC

It is possible to print to either of the color printers located in the computational chemistry and molecular modeling lab from a PC. However, if you are physically at a remote location, this does not do you much good. This section describes how to print your work from your own PC.

Rasmol

Printing from this program is fairly straightforward: just click File and then Print. You can also cut pictures from the viewing window and paste them directly into documents or into a graphic manipulation program (e.g. Paint).

Other PC programs

There are many programs available that can read .pdb files. A free program is WebLabviewer Lite. Programs that you must purchase include Alchemy (Tripos), Chem-Draw (Oxford), WebLabviewer (MSI), etc. In addition to printing, these programs allow the user to further manipulate structures (e.g. they also do molecular mechanics, molecular dynamics, etc.). Note that PyMOL does a very nice job of displaying and printing structures.

Printing in the Molecular Modeling lab

Rasmol

Printing from rasmol on a Unix machine is a little different than on a PC. First, load the molecule you want to print into rasmol and orient it as you wish, as described earlier in this lab. Once you are happy with the way the molecule looks, pick Export and then select PostScript. Then, in the Unix window (remember, this may be hidden behind the rasmol window and you may have to drag it a little to see the command line), enter a file name for the molecule. It is a good idea to end your filename with .ps to mark it as a PostScript file. Now, either exit rasmol or open another Unix shell. Before printing the file, it is a good idea to preview it to make sure you are printing what you really want. To do this, enter:

xpsview filename.ps

(Note that on some of the Unix machines you need to use gsview in place of xpsview. If you don’t like it, figure out how to add an alias to your .cshrc file.) Once you are sure that you want to print this file you can click on file then print. Or, you can exit from xpsview and enter:

lp filename.ps

(If you aren’t in the directory that contains filename.ps you will have to give the full path name to the file.)
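
The alias mentioned in the note above comes down to a single line in .cshrc of the form alias xpsview gsview. A minimal sketch for adding that line without creating duplicates follows. The helper name add_csh_alias is hypothetical, and the script itself assumes a Bourne-compatible shell (sh), even though the line it writes uses csh alias syntax.

```shell
#!/bin/sh
# Append a csh-style alias line to a startup file, unless an alias
# with the same name is already defined there.
# Usage: add_csh_alias FILE NAME COMMAND
add_csh_alias() {
    file=$1
    name=$2
    cmd=$3
    # csh alias syntax is "alias NAME COMMAND" (no equals sign).
    if ! grep -q "^alias $name " "$file" 2>/dev/null; then
        printf 'alias %s %s\n' "$name" "$cmd" >> "$file"
    fi
}

# Example: make "xpsview" run gsview on machines that lack xpsview.
#   add_csh_alias "$HOME/.cshrc" xpsview gsview
```

The new alias takes effect in shells started afterwards, or in the current csh after entering source ~/.cshrc.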

Moilview

After starting moilview and loading the desired file (as described earlier in the lab), right click within moilview’s window and hold. Move the cursor to Plots, PostScript/Normal/Ball-n-stick. You will be asked for a name. After you respond (e.g. filename), moilview will write a file called filename.ps.

Now, either exit moilview or open another unix shell and enter:

lp filename.ps

(If you aren’t in the directory that contains filename.ps you will have to give the full path name to the file.)
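
Because a mistyped file name wastes paper and toner, it can help to confirm that a file really is PostScript before sending it to lp. PostScript files conventionally begin with the characters %!, and the sketch below checks for that marker. The helper name is_postscript is hypothetical, and a Bourne-compatible shell (sh) is assumed.

```shell
#!/bin/sh
# Succeed (exit status 0) only if FILE is readable and its first
# line starts with "%!", the marker that opens a PostScript file.
# Usage: is_postscript FILE
is_postscript() {
    [ -r "$1" ] || return 1
    first_line=
    # read fails on an empty file or a missing final newline; ignore that.
    IFS= read -r first_line < "$1" || true
    case $first_line in
        '%!'*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Example: print only if the check passes.
#   is_postscript filename.ps && lp filename.ps
```

This only inspects the first line, so it is no substitute for previewing the picture itself.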

Note, you can change the background color in moilview. This is worth doing, since printing structures with a black background uses a lot of toner.

Exercises

1) Load the DNA molecule you obtained into rasmol. Change the display style to spacefilling. Change the background from black to white. Save the resulting structure as a PostScript file. Print the file and turn it in.

2) Load the protein molecule you downloaded into moilview. Change the background to white. Turn off the waters. Display the protein in ball and stick rendering. Print the file and turn it in.