
Appendix I: Brief Unix reference

Operating systems

Whenever you are using a computer you interact with it with the help of an operating system (OS), a vital interface between hardware and user. The operating system does a number of different things. For instance, multiple programs are often run at the same time, and in this situation the operating system allocates resources to the different programs and may interrupt programs when appropriate. Another common feature of an operating system is a graphical user interface, originally developed for personal computers. Examples of popular operating systems are Microsoft Windows, Mac OS X and Linux.

Linux is an example of a Unix (or "Unix-like") operating system. Unix was originally developed in 1969 at Bell Laboratories in the US. Many different flavours of the Unix operating system have been developed, such as Solaris, HP-UX and AIX, and there are a number of freely available Unix or Unix-like systems such as GNU/Linux in different distributions such as Red Hat Enterprise Linux, Fedora, SUSE Linux Enterprise, openSUSE and Ubuntu.

If you are at a personal computer, your access to Unix depends on what operating system you are using. For instance, Microsoft Windows is not based on Unix and does not provide a Unix interface. If you would like to have a Unix environment within Windows, a possible choice is to install Cygwin (http://www.cygwin.com). Cygwin has a lot of Unix functionality and useful Unix programs, and it also includes Perl (see also Appendix III). The Mac OS X operating system is based on Unix, and all you need to do to communicate with Unix commands is to open a terminal window (available under Applications-Utilities). The same applies if you are at a computer with Linux or any other Unix flavour; just open a terminal window to get started.

Accessing a UNIX computer

Even if you do not have Unix on your personal computer, you can connect to a Unix-based server and operate from there. In fact, this is a common mode of working in bioinformatics. You typically make use of an SSH (Secure Shell) client program to connect. SSH is a network protocol that allows data to be transferred between two networked computers. The SSH client program communicates with an SSH daemon running on the server side. A typical application of the SSH client program is to log in to a remote computer and execute commands there. Examples of freely available implementations of SSH are openSSH (http://www.openssh.com/), copSSH (http://www.itefix.no/i2/copssh) and PuTTY (http://www.chiark.greenend.org.uk/~sgtatham/putty/).

From now on, we assume that you have access to a Unix terminal window. We will be focusing on operations that are possible at the command line, as this is where you are most effective in the long run.

File system

Examples of directories and their contents

  /           root of file system
  /bin        executable binary files
  /dev        special files used to represent real physical devices
  /etc        commands and files used for system administration
  /home       contains a home directory for each user of the system
  /home/joe   home directory of user joe
  /lib        libraries used by various programs and programming languages
  /tmp        a "scratch" area where any user can store files temporarily
  /usr        system files and directories that you share with other users

Moving around in the file system

Find out what directory you are currently in:

  % pwd          "print working directory"

Move to a specific directory with cd, "change directory":

  % cd /tmp      change directory to /tmp
  % cd ..        go up one level in directory tree
  % cd           (without argument) go to your home directory

Find out what files are in the current directory:

  % ls           show files in current directory
  % ls -al       show all files, including hidden ones, in long format

The ls -al command will result in a more detailed output, as in this example:

  -rw-r--r-- 1 joe users 383269 2007-11-25 16:54 PF02854.txt
  drwxr-xr-x 9 joe users    656 2003-04-04 20:12 scripts/
  -rw-r--r-- 1 joe users   4898 2006-09-12 09:12 README.txt
  -rwxr-xr-x 1 joe users 120635 2004-08-03 01:47 dnapars*

In this listing each line starts with a set of characters describing the file status. In a string like -rw-r--r--, the first '-' means that it is a regular file (a 'd' says that it is a directory). The next symbols are three groups of three, where the first group is what the owner can do, the second what the group members can do and the third what the other users can do. In each group of three, the symbols are 'r' = readable, 'w' = writable, 'x' = executable. To change the file attributes, see documentation on the Unix command chmod.

As with many other Unix commands, the wildcard * may be used with ls. The following command will list files with names beginning with 'HIV':

  % ls HIV*

Manipulating Files and Directories

Copying files

  cp [source] [destination]

Examples:

1) Copy the file /tmp/seq.fa to the current working directory. The current directory is represented by a dot '.':

  % cp /tmp/seq.fa .

2) Copy the file /tmp/seq.fa to another file in /tmp named seq2.fa:

  % cp /tmp/seq.fa /tmp/seq2.fa

3) Copy all files (*) in the directory /home/joe/seqfiles to the directory /tmp:

  % cp /home/joe/seqfiles/* /tmp

Moving and renaming files

  mv [source] [destination]

Examples:

1) Rename the file seq.fa in /tmp to seq2.fa:

  % mv /tmp/seq.fa /tmp/seq2.fa

2) Move all files (*) in the directory /home/joe/seqfiles to the directory /tmp:

  % mv /home/joe/seqfiles/* /tmp

Removing files

  % rm seq.fa

Creating and removing directories

  % mkdir dirname
  % rmdir dirname

Viewing and editing files

  % cat seq.fa   (will show the contents of seq.fa on the screen)

The cat command may also be used to merge (concatenate) files. This command will merge three different files, file1, file2 and file3, into a new file newfile:

  % cat file1 file2 file3 > newfile

The symbol > means that we are redirecting the output of the cat program to a file instead of the standard output (which is the screen). We can also append to an existing file, using the symbols >>:

  % cat file4 >> newfile

Viewing a text file on the screen one page at a time

To view a text file on the screen, use either more or less. With less:

  % less seq.fa

Some useful keys for less are:

  space : move down one page
  enter : go down one line
  u     : go up (back)
  /HIV  : search for 'HIV'
  q     : quit program

Viewing or extracting the first or last lines of a file

First and last lines of a file may be displayed using the commands head and tail, respectively.

  % head seq.fa         (by default head will show the first 10 lines of the file)
  % head -1000 seq.fa   (first 1000 lines of file will be extracted)

Text-mode editor vi

  % vi seq.fa

The editor vi is very useful whenever you do not have access to a graphical editor. A description of it, however, requires a book of its own. For more information on vi the reader is referred to other sources such as:

  http://en.wikibooks.org/wiki/Learning_the_vi_Editor
  http://unixhelp.ed.ac.uk/vi/

Graphical editors

Examples of graphical editors are emacs (http://www.gnu.org/software/emacs/), gedit (http://www.gedit.org) and nedit (http://www.nedit.org).

Extracting file components with cut

Consider the content of a file dat.txt where the columns are separated with tabs:

  1   12   1300   1306
  2   11   1500   1458
  3   17   1620   1700

We may extract the columns 1 and 3 with cut:

  % cut -f1,3 dat.txt

which produces:

  1   1300
  2   1500
  3   1620

The fields or columns to be extracted are specified with the -f option. The default separator is tab, but we may use any separator. The separator is specified with the -d option to cut. Consider the file dat2.txt which contains:

  A;2;4500
  B;5;4505
  F;4;4510

We try the cut command:

  % cut -f1,2 -d ';' dat2.txt

The output will be:

  A;2
  B;5
  F;4

Sorting

Consider the file dat3.txt that contains:

  A   12   1300   1306
  C   11   1500   1458
  B   17   1620   1700

The lines may be sorted using sort:

  % sort dat3.txt

The output will be:

  A   12   1300   1306
  B   17   1620   1700
  C   11   1500   1458

The sort utility sorts lines alphabetically by default. Sorting is done numerically if we use the option -n. In addition, we may specify sorting with respect to a specific column, using the parameter -k:

  % sort -n -k2 dat3.txt

The output is then:

  C   11   1500   1458
  A   12   1300   1306
  B   17   1620   1700

Note that now the values in column 2 are in numerical order. We may also reverse the order of sorting with the -r parameter:

  % sort -n -k2 -r dat3.txt

Unique lines

The uniq command is used to identify the unique lines in a file:

  % uniq sortedfile

For this to work well the lines in the file need to be sorted first with sort. A useful option to uniq is -c. The effect is to list the number of times each line occurs:

  % uniq -c sortedfile

A highly useful Unix utility is the grep command, used to search files for text strings or regular expression matching:

  % grep ">" seq.fa | wc

In this example grep will identify all lines in seq.fa that contain '>'. The output of grep will then be directed to wc.
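The sort, cut, uniq and grep commands described above are often chained together with the pipe symbol |. The following is a minimal sketch of how that might look, not a definitive recipe. It recreates dat3.txt from the Sorting section in a temporary directory. The contents of seq.fa (three short records, one header repeated) are hypothetical and chosen only for illustration, and the use of mktemp, paste and tr is an assumption about what is available on your system.

```shell
#!/bin/sh
# Work in a throwaway directory so that no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate dat3.txt from the Sorting section (columns separated by tabs).
printf 'A\t12\t1300\t1306\nC\t11\t1500\t1458\nB\t17\t1620\t1700\n' > dat3.txt

# Sort numerically on column 2, keep only column 1,
# and join the resulting lines with spaces.
sorted_names=$(sort -n -k2 dat3.txt | cut -f1 | paste -s -d ' ' -)
echo "Names ordered by column 2: $sorted_names"

# Hypothetical sequence file; header lines start with '>'.
printf '>HIV1\nACGT\n>HIV2\nGGCC\n>HIV1\nTTAA\n' > seq.fa

# grep selects the header lines and wc -l counts them.
# tr removes the padding that some versions of wc add.
n_headers=$(grep '>' seq.fa | wc -l | tr -d ' ')
echo "Header lines: $n_headers"

# uniq only collapses adjacent duplicates, so sort first.
n_distinct=$(grep '>' seq.fa | sort | uniq | wc -l | tr -d ' ')
echo "Distinct headers: $n_distinct"

# Show how many times each header occurs.
grep '>' seq.fa | sort | uniq -c
```

Note that wc -l is used here rather than plain wc, so that only the line count is reported. The single quotes around '>' matter: without quoting, the shell would treat > as a redirection.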