
Basics of UNIX

August 23, 2012

By UNIX, I mean any UNIX-like operating system, including Linux and Mac OS X. On the Mac you can access a UNIX terminal window with the Terminal application (under Applications/Utilities). Much of modern scientific computing is done on UNIX-based machines, often by remotely logging in to a UNIX-based server.

1 Connecting to a UNIX machine from {UNIX, Mac, Windows}

See the file on bspace on connecting remotely to SCF. In addition, this SCF page has information on logging in to remote machines via ssh without having to type your password every time. This can save a lot of time.

2 Getting help from SCF

More generally, the department computing FAQs are the place to go for answers to questions about SCF. For questions not answered there, the SCF requests: “please report any problems regarding equipment or systems to the SCF staff by sending email to ’trouble’ or by reporting the problem directly to room 498/499. For information/questions on the use of application packages (e.g., R, SAS, Matlab), programming languages and libraries send mail to ’consult’. Questions/problems regarding accounts should be sent to ’manager’.” Note that for the purpose of this class, questions about application packages, languages, libraries, etc. can be directed to me.

3 Files and directories

1. Files are stored in directories (aka folders) that are arranged in an (inverted) tree, with “/” as the root of the tree

2. Where am I?
> pwd

3. What's in a directory?
> ls
> ls -a
> ls -al

4. Moving around
> cd /home/paciorek/teaching/243
> cd ~paciorek/teaching/243
> cd ~/teaching/243
> cd 243 # provided I am in 'teaching'
> cd ..
> cd -

5. Copying and removing
> cp file1 file2
> cp -r dir1 dir2
> cp -rp # preserves timestamps and other metainfo (VERY handy for tracing your workflows if you move files between machines)
> rm file1
> rm -r dir1
> rm -rf dir1 # CAREFUL!
To copy between machines, we can use scp, which has similar options to cp:
> scp file.txt [email protected]:~/research/.
> scp [email protected]:/data/file.txt ~/research/renamed.txt

6. File permissions: ls -al will show the permissions for the 'user', 'group', 'other'

• to allow a file to be executed as a program:
> chmod ugo+x myProg # myProg should be compiled code or a shell script
• to allow read and write access to all:
> chmod ugo+rw code.q
• to prevent write access by the group and others:
> chmod go-w myThesisCode.q

7. Compressing files

• the zip utility compresses in a format compatible with zip files for Windows:
> zip files.zip a.txt b.txt c.txt
• gzip is the standard compression utility in UNIX:
> gzip a.txt b.txt c.txt # will create a.txt.gz, b.txt.gz, c.txt.gz
• tar will nicely wrap up entire directories:
> tar -cvf files.tar myDirectory
> tar -cvzf files.tgz myDirectory
• To unwrap a tarball
> tar -xvf files.tar
> tar -xvzf files.tgz
• To peek into a zipped (text) file:
> gzip -cd file.gz | less
> zcat file.zip | less

4 A variety of UNIX tools/capabilities

Many UNIX programs are small programs (tools, utilities) that can be combined to do complicated things.

1. For help on a UNIX program, including command-line utilities like ls, cp, etc. > man cp

2. What's the path of an executable?
> which R

3. Tools for remotely mounting a remote UNIX machine's filesystem as a 'local' directory on your machine:

• Samba protocol - see “How can I mount my home directory” on the SCF Help Desk FAQs
• MacFUSE
• Linux:
> cd; mkdir nersc # create a directory as a mountpoint
> sshfs carver.nersc.gov: /Users/paciorek/nersc # mount the remote filesystem
> fusermount -u ~/scf # to unmount

4. Cloud storage: Dropbox and other services will mirror directories on multiple machines and on their servers

5. To run something on the UNIX command line from within R, use the system() function in R:
> system("ls -al")

6. Editors For statistical computing, we need an editor, not a word processor, because we’re going to be operating on files of code and data files, for which word processing formatting gets in the way.

• traditional UNIX: emacs, vi
• Windows: WinEdt
• Mac: Aquamacs Emacs, TextMate, TextEdit
• Be careful in Windows - file suffixes are often hidden

7. Basic emacs:

• emacs has special modes for different types of files: R code files, C code files, Latex files – it’s worth your time to figure out how to set this up on your machine for the kinds of files you often work on

Table 1. Helpful emacs control sequences.
Sequence                              Result
C-x,C-c                               Close the file
C-x,C-s                               Save the file
C-x,C-w                               Save with a new name
C-s                                   Search
ESC                                   Get out of command buffer at bottom of screen
C-a                                   Go to beginning of line
C-e                                   Go to end of line
C-k                                   Delete the rest of the line from cursor forward
C-space, then move to end of block    Highlight a block of text
C-w                                   Remove the highlighted block, putting it in the kill buffer
C-y (after using C-k or C-w)          Paste from kill buffer ('y' is for 'yank')

– For working with R, ESS (emacs speaks statistics) mode is helpful. This is built in to Aquamacs emacs. Alternatively, the Windows and Mac versions of R, as well as RStudio (available for all platforms) provide a GUI with a built-in editor. • To open emacs in the terminal window rather than as a new window, which is handy when it’s too slow (or impossible) to tunnel the graphical emacs window through ssh: > emacs -nw file.txt

8. Files that provide information about a UNIX machine:

• /proc/meminfo
• /proc/cpuinfo
• /etc/issue
• Example: how do I find out how many processors a machine has?
> grep processor /proc/cpuinfo

9. There are (free) tools in UNIX to convert files between lots of formats (pdf, html, latex, jpg). This is particularly handy when preparing figures for a publication. My computing tips page lists a number of these.

The shell and UNIX utilities

August 26, 2012

Sections with advanced material that are not critical for those of you just getting started with UNIX are denoted (***). However, I will ask you to write a shell function on the first problem set. Note that it can be difficult to distinguish what is shell-specific and what is just part of UNIX. Some of the material here is not bash-specific but general to UNIX. Reference: Newham and Rosenblatt, Learning the bash Shell, 2nd ed.

1 Shell basics

The shell is the interface between you and the UNIX operating system. When you are working in a terminal window (i.e., a window with the command line interface), you’re interacting with a shell. There are multiple shells (sh, bash, csh, tcsh, ksh). We’ll assume usage of bash, as this is the default for Mac OSX and on the SCF machines and is very common for Linux.

1. What shell am I using?
> echo $SHELL

2. To change to bash on a one-time basis: > bash

3. To make it your default:
> chsh /bin/bash
/bin/bash should be whatever the path to the bash shell is, which you can figure out using which bash

Shell commands can be saved in a file (with extension .sh) and this file can be executed as if it were a program. To run a shell script called file.sh, you would type ./file.sh. Note that if you just typed file.sh, the shell will generally have trouble finding the script and recognizing that it is executable.
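As a minimal sketch (the file contents here are just illustrative), file.sh might contain:
#!/bin/bash
# report the date and the number of files in the current directory
date
ls | wc -l
You would then make it executable and run it:
> chmod ugo+x file.sh # see the file permissions notes in the Basics of UNIX unit
> ./file.sh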

2 Tab completion

When working in the shell, it is often unnecessary to type out an entire command or file name, because of a feature known as tab completion. When you are entering a command or filename in the shell, you can, at any time, hit the tab key, and the shell will try to figure out how to complete the name of the command or filename you are typing. If there is only one command in the search path and you’re using tab completion with the first token of a line, then the shell will display its value and the cursor will be one space past the completed name. If there are multiple commands that match the partial name, the shell will display as much as it can. In this case, hitting tab twice will display a list of choices, and redisplay the partial command line for further editing. Similar behavior with regard to filenames occurs when tab completion is used on anything other than the first token of a command. Note that R does tab completion for objects (including functions) and filenames.

3 Command history

By using the up and down arrows, you can scroll through commands that you have entered previ- ously. So if you want to rerun the same command, or fix a typo in a command you entered, just scroll up to it and hit enter to run it or edit the line and then hit enter. Note that you can use emacs-like control sequences (C-a, C-e, C-k) to navigate and delete characters, just as you can at the prompt in the shell usually. You can also rerun previous commands as follows: > !-n # runs the nth previous command > !xt # runs the last command that started with ’xt’ If you’re not sure what command you’re going to recall, you can append :p at the end of the text you type to do the recall, and the result will be printed, but not executed. For example: > !xt:p You can then use the up arrow key to bring back that statement for editing or execution. You can also search for commands by doing C-r and typing a string of characters to search for in the search history. You can hit return to submit, C-c to get out, or ESC to put the result on the regular command line for editing.

4 Basic UNIX utilities

Table 1 shows some basic UNIX programs, which are sometimes referred to as filters. The general syntax for a UNIX program is

Table 1. UNIX utilities.
Name    What it does
tail    shows last few lines of a file
less    shows a file one screen at a time
cat     writes file to screen
wc      counts words and lines in a file
grep    finds patterns in files
wget    downloads files from the web
sort    sorts a file by line
nl      numbers lines in a file
diff    compares two files
uniq    removes repeated (sequential) rows
cut     extracts fields (columns) from a file

> command -options argument1 argument2 ...
For example,
> grep -i graphics file.txt
looks for graphics (argument 1) in file.txt (argument 2) with the option -i, which says to ignore the case of the letters.
> less file.txt
simply pages through a text file (you can navigate up and down) so you can get a feel for what's in it. UNIX programs often take options that are identified with a minus followed by a letter, followed by the specific option (adding a space before the specific option is fine). Options may also involve two dashes, e.g., R --no-save. Here's another example that tells tail to keep refreshing as the file changes:
tail -f dat.txt
A few more tidbits about grep:
grep ^read code.r # returns lines that start with 'read'
grep dat$ code.r # returns lines that end with 'dat'
grep 7.7 dat.txt # returns lines with two sevens separated by a single character
grep 7.*7 dat.txt # returns lines with two sevens separated by any number of characters
If you have a big data file and need to subset it by line (e.g., with grep) or by field (e.g., with cut), then you can do it really fast from the UNIX command line, rather than reading it with R, SAS, etc. Much of the power of these utilities comes in piping between them (see Section 5) and using wildcards (see Section 6) to operate on groups of files. The utilities can also be used in shell scripts to do more complicated things.

Table 2. Redirection.
Syntax                             What it does
cmd > file                         sends stdout from cmd into file, overwriting file
cmd >> file                        appends stdout from cmd to file
cmd >& file                        sends stdout and stderr from cmd to file
cmd < file                         executes cmd, reading stdin from file
cmd < infile > outfile 2> errors   reads from infile, sending stdout to outfile and stderr to errors
cmd1 | cmd2                        sends stdout from cmd1 as stdin to cmd2 (a pipe)
Note that cmd may include options and arguments as seen in the previous section.

5 Redirection

UNIX programs that involve input and/or output often operate by reading input from a stream known as standard input (stdin), and writing their results to a stream known as standard output (stdout). In addition, a third stream known as standard error (stderr) receives error messages and other information that's not part of the program's results. In the usual interactive session, standard output and standard error default to your screen, and standard input defaults to your keyboard. You can change the place from which programs read and write through redirection. The shell provides this service, not the individual programs, so redirection will work for all programs. Table 2 shows some examples of redirection. Operations where output from one command is used as input to another command (via the | operator) are known as pipes; they are made especially useful by the convention that many UNIX commands will accept their input through the standard input stream when no file name is provided to them. Here's an example of finding out how many unique entries there are in the 2nd column of a data file whose fields are separated by commas:
> cut -d',' -f2 mileage2009.csv | sort | uniq | wc
To see if there are any "M" values in certain fields (fixed width) of a set of files (note I did this on 22,000 files (5 Gb or so) in about 15 minutes on my old desktop; it would have taken hours to read the data into R):
> cut -b1,2,3,4,5,6,7,8,9,10,11,29,37,45,53,61,69,77,85,93,101,109,117,125,133,141,149,157,165,173,181,189,197,205,213,221,229,237,245,253,261,269 USC*.dly | grep "M" | less
A closely related, but subtly different, capability is offered by the use of backticks (`). When the shell encounters a command surrounded by backticks, it runs the command and replaces the backticked expression with the output from the command; this allows something similar to a pipe, but is appropriate when a command reads its arguments directly from the command line instead of through standard input. For example, suppose we are interested in searching for the text pdf in the last 4 R code files (those with suffix .q) that were modified in the current directory. We can find the names of the last 4 files ending in ".R" or ".r" which were modified using
> ls -t *.{R,r} | head -4
and we can search for the required pattern using grep. Putting these together with the backtick operator we can solve the problem using
> grep pdf `ls -t *.{R,r} | head -4`
Note that piping the output of the ls command into grep would not achieve the desired goal, since grep reads its filenames from the command line, not standard input. You can also redirect output as the arguments to another program using the xargs utility. Here's an example:
> which bash | xargs chsh
And you can redirect output into a shell variable (see section 9) by putting the command that produces the output in parentheses and preceding it with a $. Here's an example:
> files=$(ls) # NOTE - don't put any spaces around the '='
> echo $files

Table 3. Wildcards.
Syntax                  What it matches
?                       any single character
*                       zero or more characters
[c1c2...]               any character in the set
[!c1c2...]              anything not in the set
[c1-c2]                 anything in the range from c1 to c2
{string1,string2,...}   anything in the set of strings

6 Wildcards in filenames

The shell will expand certain special characters to match patterns of file names before passing those filenames on to a program. Note that the programs themselves don't know anything about wildcards; it is the shell that does the expansion, so that programs don't see the wildcards. Table 3 shows some of the special characters that the shell uses for expansion. Here are some examples of using wildcards:

• List all files ending with a digit:

> ls *[0-9]

• Make a copy of filename as filename.old

> cp filename{,.old}

• Remove all files beginning with a or z:

> rm [az]*

• List all the R code files with a variety of suffixes:

> ls *.{r,q,R}

The echo command can be used to verify that a wildcard expansion will do what you think it will: > echo cp filename{,.old} # returns cp filename filename.old If you want to suppress the special meaning of a wildcard in a shell command, precede it with a backslash (\). Note that this is a general rule of thumb in many similar situations when a character has a special meaning but you just want to treat it as a character.

7 Job control

Starting a job When you run a command in a shell by simply typing its name, you are said to be running in the foreground. When a job is running in the foreground, you can't type additional commands into that shell, but there are two signals that can be sent to the running job through the keyboard. To interrupt a program running in the foreground, use C-c; to quit a program, use C-\. While modern windowed systems have lessened the inconvenience of tying up a shell with foreground processes, there are some situations where running in the foreground is not adequate. The primary need for an alternative to foreground processing arises when you wish to have jobs continue to run after you log off the computer. In cases like this you can run a program in the background by simply terminating the command with an ampersand (&). However, before putting a job in the background, you should consider how you will access its results, since stdout is not preserved when you log off from the computer. Thus, redirection (including redirection of stderr) is essential when running jobs in the background. As a simple example, suppose that you wish to run an R script, and you don't want it to terminate when you log off. (Note that this can also be done using R CMD BATCH, so this is primarily an illustration.)
> R --no-save < code.q >& code.Rout &
If you forget to put a job in the background when you first execute it, you can do it while it's running in the foreground in two steps. First, suspend the job using the C-z signal. After receiving the signal, the program will interrupt execution, but will still have access to all files and other resources. Next, issue the bg command, which will put the stopped job in the background.

Listing and killing jobs Since only foreground jobs will accept signals through the keyboard, if you want to terminate a background job you must first determine the unique process id (PID) for the process you wish to terminate through the use of the ps command. For example, to see all the R jobs running on a particular computer, you could use a command like:
> ps -aux | grep R # sometimes this needs to be "ps aux | grep R"
Among the output after the header (shown here) might appear a line that looks like this:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
paciorek 11998 97.0 39.1 1416644 1204824 pts/16 R+ Jul27 1330:01 /usr/lib64/R/bin/exec/R
In this example, the ps output tells us that this R job has a PID of 11998, that it has been running for 1330 minutes (!), is using 97% of CPU and 39% of memory, and that it started on July 27. You could then issue the command:
> kill 11998
or, if that doesn't work
> kill -9 11998
to terminate the job. Another useful command in this regard is killall, which accepts a program name instead of a process id, and will kill all instances of the named program. E.g.,
> killall R
Of course, it will only kill the jobs that belong to you, so it will not affect the jobs of other users. Note that the ps and kill commands only apply to the particular computer on which they are executed, not to the entire computer network. Thus, if you start a job on one machine, you must log back into that same machine in order to manage your job.

Monitoring jobs and memory use The top command also allows you to monitor the jobs on the system in real time. In particular, it's useful for seeing how much of the CPU and how much memory is being used, as well as figuring out a PID as an alternative to ps. You can also renice jobs (see below) and kill jobs from within top: just type r or k, respectively, and proceed from there. One of the main things to watch out for is a job that is using close to 100% of memory and much less than 100% of CPU. What is generally happening is that your program has run out of memory and is using virtual memory on disk, spending most of its time writing to/from disk, sometimes called paging or swapping. If this happens, it can be a very long time, if ever, before your job finishes.

Nicing a job (IMPORTANT) The most important thing to remember when starting a job on a machine that is not your personal machine is how to be a good citizen. This often involves 'nicing' your jobs, and this is REQUIRED on the SCF machines (although the compute servers automatically nice your jobs). Nicing a job puts it at a lower priority so that a user working at the keyboard has higher priority in using the CPU. Here's how to do it, giving the job a low priority of 19, as required by SCF:
> nice -19 R CMD BATCH --no-save in.R out.Rout
If you forget and just submit the job without nicing, you can reduce the priority by doing:
> renice +19 11998
where 11998 is the PID of your job. On many larger UNIX cluster systems, all jobs are submitted via a job scheduler and enter a queue, which handles prioritization and conflicts between jobs. Syntax varies by system and queueing software, but may look something like this for submitting an R job:
> bsub -q long R CMD BATCH --no-save in.R out.Rout # just an example; this will not work on the SCF network

8 Aliases (***)

Aliases allow you to use an abbreviation for a command, to create new functionality or to ensure that certain options are always used when you call an existing command. For example, I'm lazy and would rather type q instead of exit to terminate a shell window. You could create the alias as follows:
> alias q="exit"
As another example, suppose you find the -F option of ls (which displays / after directories, * after executable files and @ after links) to be very useful. The command
> alias ls="ls -F"
will ensure that the -F option will be used whenever you use ls. If you need to use the unaliased version of something for which you've created an alias, precede the name with a backslash (\). For example, to use the normal version of ls after you've created the alias described above, just type
> \ls
The real power of aliases is only achieved when they are automatically set up whenever you log in to the computer or open a new shell window. To achieve that goal with aliases (or any other bash shell commands), simply insert the commands in the file .bashrc in your home directory. See the example.bashrc file on bspace for some of what's in my .bashrc file.

9 Shell Variables (***)

We can define shell variables that will help us when writing shell scripts. Here’s an example of defining a variable:

> name="chris"
The shell may not like it if you leave any spaces around the = sign. To see the value of a variable we need to precede it by $:
> echo $name
You can also enclose the variable name in curly brackets, which comes in handy when we're embedding a variable within a line of code to make sure the shell knows where the variable name ends:
> echo ${name}
There are also special shell variables called environment variables that help to control the shell's behavior. These are generally named in all caps. Type env to see them. You can create your own as follows:
> export NAME="chris"
The export command ensures that other shells created by the current shell (for example, to run a program) will inherit the variable. Without the export command, any shell variables that are set will only be modified within the current shell. More generally, if one wants a variable to always be accessible, one would include the definition of a variable with an export command in your .bashrc file. Here's an example of modifying an environment variable:
> export CDPATH=.:~/research:~/teaching
Now if you have a subdirectory bootstrap in your research directory, you can type cd bootstrap no matter what your pwd is and it will move you to ~/research/bootstrap. Similarly for any subdirectory within the teaching directory. Here's another example of an environment variable that puts the username, hostname, and pwd in your prompt. This is handy so you know what machine you're on and where in the filesystem you are.
> export PS1="\u@\h:\w> "
For me, this is one of the most important things to put in my .bashrc file. The \ syntax tells bash what to put in the prompt string: \u for username, \h for hostname, and \w for working directory.

10 Functions (***)

You can define your own utilities by creating a shell function. This allows you to automate things that are more complicated than you can do with an alias. Here’s an example from my .bashrc that uses ssh to locally mount (on my desktop machine) remote filesystems for other systems I have access to: function mounts(){

sshfs carver.nersc.gov: /accounts/gen/vis/paciorek/nersc
sshfs hpcc.sph.harvard.edu: /accounts/gen/vis/paciorek/hpcc

}
Now I just type
> mounts
and the shell will execute the code in the function. (Note: this could be done with an alias by just separating the two items with semicolons.) One nice thing is that the shell automatically takes care of function arguments for you. It places the arguments given by the user into local variables in the function called (in order): $1 $2 $3 etc. It also fills $# with the number of arguments given by the user. Here's an example of using arguments in a function that saves me some typing when I want to copy a file to the SCF filesystem:
function putscf() {
scp $1 [email protected]:~/$2
}
To use this function, I just do the following to copy unit1.pdf from the current directory on whatever non-SCF machine I'm on to the directory ~/teaching/243 on SCF:
> putscf unit1.pdf teaching/243/.
Of course you'd want to put such functions in your .bashrc file.

11 If/then/else (***)

We can use if-then-else type syntax to control the flow of a shell script. For an example, see niceR() in the demo code file for this unit. For more details, look in Newham&Rosenblatt or search online.
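Since the demo code is not reproduced here, here is a minimal sketch of the syntax - a hypothetical niceR()-style function in the spirit of the demo, not the actual demo code:
function niceR() {
# submit an R job nicely; expects an input file and an output file
if [ $# -lt 2 ]
then
  echo "usage: niceR inputFile outputFile"
else
  nice -19 R CMD BATCH --no-save $1 $2
fi
}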

12 For loops (***)

For loops in shell scripting are primarily designed for iterating through a set of files or directories. Here's an example:
for file in $(ls *txt)
do
  mv $file ${file/.txt/.q} # this syntax replaces .txt with .q in $file
done
Another use of for loops is automating file downloads (see the demo code file; a sketch follows below). And, in my experience, for loops are very useful for starting a series of jobs: see the demo code file.
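Here is the promised sketch of a download loop; the URL is made up purely for illustration:
for yr in 2008 2009 2010
do
  wget http://www.example.com/data/temps_${yr}.csv # hypothetical URL
done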

13 How much shell scripting should I learn?

You can do a fair amount of what you need from within R using the system() function. This will enable you to avoid dealing with the shell programming syntax (but you'll still need to know how to use UNIX utilities, wildcards, and pipes to be effective). Example: a fellow student in grad school programmed a tool in R to extract concert information from the web for bands appearing in her iTunes library. Not the most elegant solution, but it got the job done. For more extensive shell programming, it's probably worth learning Python or Perl and doing it there rather than using a shell script.

Using R

October 5, 2012

References: • Adler

• Chambers

• R intro manual on CRAN (R-intro).

• Venables and Ripley, Modern Applied Statistics with S

• Murrell, Introduction to Data Technologies.

• R Data Import/Export manual on CRAN (R-data). I’m going to try to refer to R syntax as statements, where a statement is any code that is a valid, complete R expression. I’ll try not to use the term expression, as this actually means a specific type of object within the R language. One of my goals in our coverage of R is for us to think about why things are the way they are in R. I.e., what principles were used in creating the language.

1 Some basic functionality

I’ll assume everyone knows about the following functions/functionality in R: getwd(), setwd(), source(), pdf(), save(), save.image(), load() • To run UNIX commands from within R, use system(), e.g.,

system("ls -al") # knitr/Sweave doesn't seem to show the output of system() files <- system("ls", intern = TRUE) files[1:5]

## [1] "badCode.R" "badCode.R~" "bash.lyx" "bash.lyx~" "bash.pdf"

• There are also a bunch of functions that will do specific queries of the filesystem, including

file.exists("file.txt") list.files("~/research")

• To get some info on the system you’re running on:

Sys.info()

• To see some of the options that control how R behaves, try the options() function. The width option changes the number of characters of width printed to the screen, while the max.print option prevents too much of a large object from being printed to the screen. The digits option changes the number of digits of numbers printed to the screen (but be careful as this can be deceptive if you then try to compare two numbers based on what you see on the screen).

# options() # this would print out a long list of options options()[1:5]

## $add.smooth ## [1] TRUE ## ## $bitmapType ## [1] "cairo" ## ## $browser ## [1] "xdg-open" ## ## $browserNLdisabled ## [1] FALSE ## ## $check.bounds ## [1] FALSE

options()[c("width", "digits")]

## $width ## [1] 75 ## ## $digits ## [1] 4

# options(width = 120) # often nice to have more characters on screen options(width = 55) # for purpose of making the pdf of this document options(max.print = 5000) options(digits = 3) a <- 0.123456 b <- 0.1234561 a

## [1] 0.123 b

## [1] 0.123 a == b

## [1] FALSE

• Use C-c to interrupt execution. This will generally back out gracefully, returning you to a state as if the command had not been started. Note that if R is exceeding memory availability, there can be a long delay. This can be frustrating, particularly since a primary reason you would want to interrupt is when R runs out of memory.

• The R mailing list archives are very helpful for getting help - always search the archive before posting a question.

– sessionInfo() gives information on the current R session - it’s a good idea to include this information (and information on the operating system such as from Sys.info()) when you ask for help on a mailing list

• Any code that you wanted executed automatically when starting R can be placed in ~/.Rprofile (or in individual .Rprofile files in specific directories). This could include loading packages (see below), sourcing files with user-defined functions (you can also put the function code itself in .Rprofile), assigning variables, and specifying options via options().
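For example, a ~/.Rprofile might contain something like the following (the specific package, options, and variable are just illustrative, not a recommendation):

# hypothetical ~/.Rprofile
options(width = 120, digits = 4)
library(fields) # load a package used in most sessions
myBaseDir <- "~/research" # a variable available in every session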

2 Packages

One of the killer apps of R is the extensive collection of add-on packages on CRAN (www.cran.r-project.org) that provide much of R's functionality. To make use of a package, it needs to be installed on your system (using install.packages() once only) and loaded into R (using library() every time you start R). Some packages are installed by default with R and of these, some are loaded by default, while others require a call to library(). For packages I use a lot, I install them once and then load them automatically every time I start R using my ~/.Rprofile file.

Loading packages To make a package available (loading it) and to see all the installed packages that you could load you can use library().

library(fields) library(help = fields) # library() # I don't want to run this on SCF because # so many are installed

Notice that some of the packages are in a system directory and some are in my home directory. Packages often depend on other packages. In general, if one package depends on another, R will load the dependency, but if the dependency is installed locally (see below), R may not find it automatically and you may have to use library() to load the dependency first. .libPaths() shows where R looks for packages on your system.

Installing packages If a package is on CRAN but not on your system, you can install it easily (usually). You don’t need root permission on a machine to install a package (though sometimes you run into hassles if you are installing it just as a user, so if you have administrative privileges it may help to use them).

install.packages("fields", lib = "~/Rlibs") # ~/Rlibs needs to exist!

You can also download the zipped source file from CRAN and install from the file; see the help page for install.packages().

Accessing objects from packages The objects in a package (primarily functions, but also data) are in their own workspaces, and are accessible after you load the package using library(), but are not directly visible when you use ls(). We'll see more about this when we talk about scope and environments. If we want to see the objects in one of the other workspaces, we can do the following:

search()

## [1] ".GlobalEnv" "package:knitr" ## [3] "package:stats" "package:graphics" ## [5] "package:grDevices" "package:utils" ## [7] "package:datasets" "package:fields" ## [9] "package:spam" "package:methods" ## [11] "package:SCF" "Autoloads" ## [13] "package:base"

# ls(pos = 7) # for the datasets package (the 7th entry in the search path above) ls(pos = 7)[1:5] # just show the first few

## [1] "ability.cov" "airmiles" "AirPassengers" ## [4] "airquality" "anscombe"

3 Objects

3.1 Classes of objects

Everything in R is stored as an object, each of which has a class that describes what the object contains, along with standard functions that operate on objects in the class. Much of R is object-oriented, though we can write code in R that is not explicitly object-oriented. The basic classes are

• character vectors: these are vectors of strings. Examples of individual strings are: “Sam”, “0923”, “Sam9”, “Sam is”, “Sam\t9”, “Sam’s the man.\nNo doubt.\n”. Each of these is a character vector of length 1.

• numeric vectors (i.e., double precision real numbers)

• integer vectors

• logical (TRUE/FALSE) vectors

• complex vectors

• lists: vectors of arbitrary objects

• factors: vector-like objects of categorical variables with a pre-defined set of labels Notes:

– Scalars are actually vectors of length 1 in R.
– Unlike in compiled languages, objects can be created without explicitly initializing them and allocating memory.
– Factors can be ordered - see ordered().

More complicated objects:

– Data frames are a list of vectors of the same length, where the vectors may have different types.
– Matrices are different from data frames in that all the elements are of a single type. Furthermore, matrices and arrays (which allow dimensions of 1, 2, 3, ...) can be thought of as vectors with information on dimensions (layout). If you pass a matrix into a function expecting a vector, it will just treat it as a vector (of concatenated columns).

We can check on the class of an object using class() or with specific boolean queries: is.numeric(), is.character(), is.vector(), etc.
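As a quick illustration of the class queries and of a matrix being treated as a vector (the values here are arbitrary):

x <- c(3.1, 2.7)
is.numeric(x) # TRUE
is.character(x) # FALSE
m <- matrix(1:6, nrow = 2)
sum(m) # 21; sum() just sees the 6 concatenated values
m[5] # 5; single-index subsetting also treats m as a vector (column-major order)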

3.2 Assignment and coercion

We assign into an object using either '=' or '<-'. A rule of thumb is that for basic assignments where you have an object name, then the assignment operator, and then some code, '=' is fine, but otherwise use '<-'.
out = mean(rnorm(7)) # OK
system.time(out = rnorm(10000)) # NOT OK, system.time() expects its argument to be named 'expr', so 'out = ...' is treated as an unknown argument

## Error: unused argument(s) (out = rnorm(10000)) system.time(out <- rnorm(10000))

## user system elapsed ## 0 0 0

Let's look at these examples to understand the distinction between '=' and '<-' when passing arguments to a function.
mean

## function (x, ...) ## UseMethod("mean") ## ##
x <- 0
y <- 0
out <- mean(x = c(3, 7)) # the usual way to pass an argument to a function
out <- mean(x <- c(3, 7)) # this is allowable, though perhaps not useful; what do you think happens to x?
out <- mean(y = c(3, 7))

## Error: argument "x" is missing, with no default out <- mean(y <- c(3, 7))

One situation in which you want to use '<-' is if it is being used as part of an argument to a function, so that R realizes you're not indicating one of the function arguments, e.g.:
mat <- matrix(c(1, NA, 2, 3), nrow = 2, ncol = 2)
apply(mat, 1, sum.isna <- function(vec) {
    return(sum(is.na(vec)))
})

## [1] 0 1 apply(mat, 1, sum.isna = function(vec) { return(sum(is.na(vec))) }) # NOPE

## Error: argument "FUN" is missing, with no default

R often treats integers as numerics, but we can force R to store values as integers:

vals <- c(1, 2, 3) class(vals)

## [1] "numeric" vals <- 1:3 class(vals)

## [1] "integer" vals <- c(1L, 2L, 3L) vals

## [1] 1 2 3 class(vals)

## [1] "integer"

We convert between classes using variants on as(): e.g., as.character(c(1, 2, 3))

## [1] "1" "2" "3" as.numeric(c("1", "2.73"))

## [1] 1.00 2.73 as.factor(c("a", "b", "c"))

## [1] a b c ## Levels: a b c

Some common conversions are converting numbers that are being interpreted as characters into actual numbers, converting between factors and characters, and converting between logical TRUE/FALSE vectors and numeric 1/0 vectors. In some cases R will automatically do conversions behind the scenes in a smart way (or occasionally not so smart way). We’ll see implicit conversion (also called coercion) when we read in characters into R using read.table() - strings are often automatically coerced to factors. Consider these examples of implicit coercion:

x <- rnorm(5)
x[3] <- "hat" # What do you think is going to happen?
indices = c(1, 2.73)
myVec = 1:10
myVec[indices]

## [1] 1 2

In other languages, converting between different classes is sometimes called casting a variable.

3.3 Type vs. class

We’ve discussed vectors as the basic data structure in R, with character, integer, numeric, etc. classes. Objects also have a type, which relates to what kind of values are in the objects and how objects are stored internally in R (i.e., in C). Let’s look at Adler’s Table 7.1 to see some other types.

a <- data.frame(x = 1:2) class(a)

## [1] "data.frame"

typeof(a)

## [1] "list" m <- matrix(1:4, nrow = 2) class(m)

## [1] "matrix"

typeof(m)

## [1] "integer"

We’ve said that everything in R is an object and all objects have a class. For simple objects class and type are often closely related, but this is not the case for more complicated objects. The class describes what the object contains and standard functions associated with it. In general, you mainly need to know what class an object is rather than its type. Classes can inherit from other

classes; for example, the glm class inherits characteristics from the lm class. We'll see more on the details of object-oriented programming in the R programming unit. We can create objects with our own defined class.

me <- list(firstname = "Chris", surname = "Paciorek") class(me) <- "personClass" # it turns out R already has a 'person' class defined class(me)

## [1] "personClass"

is.list(me)

## [1] TRUE

typeof(me)

## [1] "list"

typeof(me$firstname)

## [1] "character"

3.4 Information about objects

Some functions that give information about objects are:

is(me, "personClass")

## [1] TRUE

str(me)

## List of 2 ## $ firstname: chr "Chris" ## $ surname : chr "Paciorek" ## - attr(*, "class")= chr "personClass"

attributes(me)

## $names ## [1] "firstname" "surname" ## ## $class ## [1] "personClass"

mat <- matrix(1:4, 2) class(mat)

## [1] "matrix"

typeof(mat)

## [1] "integer"

length(mat) # recall that a matrix can be thought of as a vector with dimensions

## [1] 4

attributes(mat)

## $dim ## [1] 2 2

dim(mat)

## [1] 2 2

Attributes are information attached to an object, stored as something that looks like a named list. Attributes are often copied when operating on an object. This can lead to some weird-looking formatting:

x <- rnorm(10 * 365) qs <- quantile(x, c(0.025, 0.975)) qs

## 2.5% 97.5% ## -2.00 1.88

qs[1] + 3

## 2.5% ## 1

Thus in subsequent operations with qs, the names attribute will often get carried along. We can get rid of it:

names(qs) <- NULL qs

## [1] -2.00 1.88

A common use of attributes is that rows and columns may be named in matrices and data frames, and elements in vectors:

row.names(mtcars)[1:6]

## [1] "Mazda RX4" "Mazda RX4 Wag" ## [3] "Datsun 710" "Hornet 4 Drive" ## [5] "Hornet Sportabout" "Valiant"

names(mtcars)

## [1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec" ## [8] "vs" "am" "gear" "carb"

mat <- data.frame(x = 1:2, y = 3:4) row.names(mat) <- c("first", "second") mat

## x y ## first 1 3 ## second 2 4

vec <- c(first = 7, second = 1, third = 5) vec["first"]

## first ## 7

3.5 The workspace

Objects exist in a workspace, which in R is called an environment.

# objects() # what objects are in my workspace identical(ls(), objects()) # synonymous

## [1] TRUE dat <- 7 dat2 <- 9 subdat <- 3 obj <- 5 objects(pattern = "^dat")

## [1] "dat" "dat2"
rm(dat2, subdat)
rm(list = c("dat2", "subdat")) # a bit confusing - the 'list' argument should be a character vector of object names

## Warning: object ’dat2’ not found ## Warning: object ’subdat’ not found rm(list = ls(pattern = "^dat")) exists("dat") # can be helpful when programming

## [1] FALSE dat <- rnorm(5e+05) object.size(dat)

## 4000040 bytes print(object.size(dat), units = "Mb") # this seems pretty clunky!

## 3.8 Mb rm(list = ls()) # what does this do?

3.6 Missing values

The basic missing value token in R is NA. Most functions handle NAs as input gracefully (as should any software you write). The user can generally provide input on how to handle NAs, with omitting them being a common desire (e.g., mean(x, na.rm = TRUE)). One checks whether a value is an NA using is.na(). NaN indicates an operation has returned something that is not a number, such as 0/0, while Inf and -Inf indicate infinity and can be used in calculations.

3/Inf

## [1] 0

3 + Inf

## [1] Inf

0/0

## [1] NaN

To check for NA or NaN, use is.na() and is.nan(). == won’t work.

x <- c(3, NA, 5, NaN) is.na(x)

## [1] FALSE TRUE FALSE TRUE

x == NA

## [1] NA NA NA NA

x[is.na(x)] <- 0 x

## [1] 3 0 5 0

3.7 Some other details

Special objects There are also some special objects, which often begin with a period, like hidden files in UNIX. One is .Last.value, which stores the last result.

rnorm(10)

## [1] -0.951 -1.468 0.702 -0.258 1.555 0.693 1.044 ## [8] 0.153 -0.515 -1.467

# .Last.value # this should return the 10 random # normals but knitr is messing things up, commented # out here

Scientific notation R uses the syntax "xep" to mean x * 10^p.
x <- 1e+05
log10(x)
y <- 1e+05
x <- 1e-08

Information about functions To get help on functions (I’m having trouble evaluating these with knitr, so just putting these in as text here):

?lm # or help(lm) help.search('lm') ('lm') help('[[') # operators are functions too args(lm)

Strings and quotation Working with strings and quotes (see ?Quotes). Generally one uses double quotes to denote text. If we want a quotation symbol in text, we can do something like the following, either combining single and double quotes or escaping the quotation: ch1 <- "Chris's\n" ch2 <- 'He said, "hello."\n' ch3 <- "He said, \"hello.\"\n"

Be careful when cutting and pasting from documents that are not text files as you may paste in something that looks like a single or double quote, but which R cannot interpret as a quote because it's actually some other (non-ASCII) quote character.

4 Working with data structures

4.1 Lists and dataframes

Extraction You extract from lists with "[[" or with "["

x <- list(a = 1:2, b = 3:4, sam = rnorm(4))
x[[2]] # extracts the indicated component, which can be anything, in this case a vector

## [1] 3 4

x[c(1, 3)] # extracts subvectors, which since it is a list, will also be a list

## $a ## [1] 1 2 ## ## $sam ## [1] -1.805 -1.803 0.488 -0.667

When working with lists, it’s handy to be able to use the same function on each element of the list:

lapply(x, length)

## $a ## [1] 2 ## ## $b ## [1] 2 ## ## $sam ## [1] 4

sapply(x, length) # returns things in a user-friendly way

## a b sam ## 2 2 4

Note that to operate on a data frame, which is a list, we’ll generally want to use lapply() or sapply(), as apply() is really designed for working with elements that are all of the same type:

apply(CO2, 2, class) # hmmm

## Plant Type Treatment conc ## "character" "character" "character" "character" ## uptake ## "character"

sapply(CO2, class)

## $Plant ## [1] "ordered" "factor" ## ## $Type ## [1] "factor" ## ## $Treatment ## [1] "factor" ## ## $conc ## [1] "numeric" ## ## $uptake ## [1] "numeric"

Here’s a nice trick to pull out a specific component from each element of a list. (Note the use of the additional argument(s) to sapply() - this can also be done in the other apply() variants.)

params <- list(a = list(mn = 7, sd = 3), b = list(mn = 6, sd = 1), c = list(mn = 2, sd = 1)) sapply(params, "[[", 1)

## a b c ## 7 6 2

Finally, we can flatten a list with unlist().

unlist(x)

## a1 a2 b1 b2 sam1 sam2 sam3 ## 1.000 2.000 3.000 4.000 -1.805 -1.803 0.488 ## sam4 ## -0.667

Calculations in the context of stratification We can also use an apply() variant to do calculations on subgroups, defined based on a factor or factors.

tapply(mtcars$mpg, mtcars$cyl, mean)

## 4 6 8 ## 26.7 19.7 15.1

tapply(mtcars$mpg, list(mtcars$cyl, mtcars$gear), mean)

## 3 4 5 ## 4 21.5 26.9 28.2 ## 6 19.8 19.8 19.7 ## 8 15.1 NA 15.4

Check out aggregate() and by() for nice wrappers to tapply() when working with data frames. aggregate() returns a data frame and works when the output of the function is univariate, while by() returns a list, so can return multivariate output:

aggregate(mtcars, list(cyl = mtcars$cyl), mean) # this applies the function to each column, for each subgroup

## cyl mpg cyl disp hp drat wt qsec vs am ## 1 4 26.7 4 105 82.6 4.07 2.29 19.1 0.909 0.727 ## 2 6 19.7 6 183 122.3 3.59 3.12 18.0 0.571 0.429 ## 3 8 15.1 8 353 209.2 3.23 4.00 16.8 0.000 0.143 ## gear carb ## 1 4.09 1.55 ## 2 3.86 3.43 ## 3 3.29 3.50

by(warpbreaks, warpbreaks$tension, function(x) {
    lm(breaks ~ wool, data = x)
})

## warpbreaks$tension: L ## ## Call: ## lm(formula = breaks ~ wool, data = x) ## ## Coefficients: ## (Intercept) woolB ## 44.6 -16.3 ## ## ------## warpbreaks$tension: M ## ## Call: ## lm(formula = breaks ~ wool, data = x) ## ## Coefficients: ## (Intercept) woolB ## 24.00 4.78 ## ## ------## warpbreaks$tension: H ## ## Call: ## lm(formula = breaks ~ wool, data = x) ## ## Coefficients: ## (Intercept) woolB ## 24.56 -5.78

In some cases you may actually need an object containing the subsets of the data, for which you can use split():

split(mtcars, mtcars$cyl)

The do.call() function will apply a function to the elements of a list. For example, we can rbind() together (if compatible) the elements of a list of vectors instead of having to loop over the elements or manually type them in:

myList <- list(a = 1:3, b = 11:13, c = 21:23) rbind(myList$a, myList$b, myList$c)

## [,1] [,2] [,3] ## [1,] 1 2 3 ## [2,] 11 12 13 ## [3,] 21 22 23

rbind(myList)

## a b c ## myList Integer,3 Integer,3 Integer,3

do.call(rbind, myList)

## [,1] [,2] [,3] ## a 1 2 3 ## b 11 12 13 ## c 21 22 23

Why couldn’t we just use rbind() directly? Basically we’re using do.call() to use functions that take “...” as input (i.e., functions accepting an arbitrary number of arguments) and to use the list as the input instead (i.e., to use the list elements).

4.2 Vectors and matrices

Column-major vs. row-major matrix storage Matrices in R are column-major ordered, which means they are stored by column as a vector of concatenated columns.

mat <- matrix(rnorm(500), nr = 50) identical(mat[1:50], mat[, 1])

## [1] TRUE

20 identical(mat[1:10], mat[1, ])

## [1] FALSE

vec <- c(mat) mat2 <- matrix(vec, nr = 50) identical(mat, mat2)

## [1] TRUE

If you want to fill a matrix row-wise:

matrix(1:4, 2, byrow = TRUE)

## [,1] [,2] ## [1,] 1 2 ## [2,] 3 4

Column-major ordering is also used in Matlab and Fortran, while row-major ordering is used in C.

Recycling R will recycle the right-hand side of an assignment if the left-hand side has more elements, and will give an error or warning if there is partial recycling. Recycling is powerful, but dangerous.

mat <- matrix(1:8, 2) mat[1, ] <- c(1, 2) mat

## [,1] [,2] [,3] [,4] ## [1,] 1 2 1 2 ## [2,] 2 4 6 8

mat[1, ] <- 1:3

## Error: number of items to replace is not a multiple of replacement length

Sequences The rep() and seq() functions are helpful for creating sequences with particular structure. Here are some examples of combining them:

rep(seq(1, 9, by = 2), each = 2)

## [1] 1 1 3 3 5 5 7 7 9 9

rep(seq(1, 9, by = 2), times = 2) # repeats the whole vector 'times' times

## [1] 1 3 5 7 9 1 3 5 7 9

rep(seq(1, 9, by = 2), times = 1:5) # repeats each element based on 'times'

## [1] 1 3 3 5 5 5 7 7 7 7 9 9 9 9 9

Sidenote: sometimes you can take a sequence generated with rep() and seq(), put it into a matrix, do some manipulations (such as transposing the matrix), and pull it out as a vector to get sequences in the order you want. For example, how could I produce “1 3 5 7 9 3 5 7 9 5 7 9 7 9 9” without using a loop? (One possible answer appears below, after the expand.grid() example.) You can also get sequences using mathematical expressions and round(), floor(), ceiling(), and modulo operations. We'll play around with this more when we talk about efficiency. You can create regular combinations of the values in two (or more) vectors as follows:

expand.grid(x = 1:2, y = -1:1, theta = c("lo", "hi"))

## x y theta ## 1 1 -1 lo ## 2 2 -1 lo ## 3 1 0 lo ## 4 2 0 lo ## 5 1 1 lo ## 6 2 1 lo ## 7 1 -1 hi ## 8 2 -1 hi ## 9 1 0 hi ## 10 2 0 hi ## 11 1 1 hi ## 12 2 1 hi

This is useful for setting up simulations and for creating a grid of values for image and surface plots.
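As one possible answer to the sidenote question above (this version uses lower.tri(), which reappears in the linear algebra section, rather than an explicit transpose):

m <- matrix(seq(1, 9, by = 2), 5, 5) # each column is 1 3 5 7 9
m[lower.tri(m, diag = TRUE)] # 1 3 5 7 9 3 5 7 9 5 7 9 7 9 9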

Identifying elements by index You can figure out the indices of elements having a given characteristic using which():

x <- c(1, 10, 2, 9, 3, 8) which(x < 3)

## [1] 1 3

which.max() and which.min() provide similar functionality, returning the index of the maximum or minimum element.

Vectorized subsetting We can subset vectors, matrices, and rows of data frames by index or by logical vectors.

set.seed(0) vec <- rnorm(8) mat <- matrix(rnorm(9), 3) vec

## [1] 1.263 -0.326 1.330 1.272 0.415 -1.540 -0.929 ## [8] -0.295

mat

## [,1] [,2] [,3] ## [1,] -0.00577 -0.799 -0.299 ## [2,] 2.40465 -1.148 -0.412 ## [3,] 0.76359 -0.289 0.252

vec[vec < 0]

## [1] -0.326 -1.540 -0.929 -0.295

vec[vec < 0] <- 0 vec

## [1] 1.263 0.000 1.330 1.272 0.415 0.000 0.000 0.000

mat[mat[, 1] < 0, ] # similarly for data frames

## [1] -0.00577 -0.79901 -0.29922

mat[mat[, 1] < 0, 2:3] # similarly for data frames

## [1] -0.799 -0.299

mat[, mat[1, ] < 0]

## [,1] [,2] [,3] ## [1,] -0.00577 -0.799 -0.299 ## [2,] 2.40465 -1.148 -0.412 ## [3,] 0.76359 -0.289 0.252

mat[mat[, 1] < 0, 2:3] <- 0 set.seed(0) # so we get the same vec as we had before vec <- rnorm(8) wh <- which(vec < 0) logicals <- vec < 0 logicals

## [1] FALSE TRUE FALSE FALSE FALSE TRUE TRUE TRUE

wh

## [1] 2 6 7 8

identical(vec[wh], vec[logicals])

## [1] TRUE vec <- c(1L, 2L, 1L) is.integer(vec)

## [1] TRUE

vec[vec == 1L] # in general, not safe with numeric vectors

## [1] 1 1

vec[vec != 3L] # nor this

## [1] 1 2 1

Finally, we can also subset a matrix with a two-column matrix of {row,column} indices.

mat <- matrix(rnorm(25), 5) rowInd <- c(1, 3, 5) colInd <- c(1, 1, 4) mat[cbind(rowInd, colInd)]

## [1] -0.00577 0.76359 -0.69095

Random sampling We can draw random samples with or without replacement using sample().
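For example (the values here are arbitrary):

sample(1:5, 3) # 3 values drawn without replacement
sample(c("H", "T"), 10, replace = TRUE) # 10 coin flips
sample(1:3, 5, replace = TRUE, prob = c(0.6, 0.3, 0.1)) # unequal probabilities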

Apply() The apply() function will apply a given function to either the rows or columns of a matrix or a set of dimensions of an array:

x <- matrix(1:6, nr = 2) x

## [,1] [,2] [,3] ## [1,] 1 3 5 ## [2,] 2 4 6

apply(x, 1, min) # by row

## [1] 1 2

apply(x, 2, min) # by column

## [1] 1 3 5

x <- array(1:24, c(2, 3, 4)) apply(x, 2, min) # for(j in 1:3) print(min(x[ , j, ]))

## [1] 1 3 5

apply(x, c(2, 3), min) # for(j in 1:3) {for(k in 1:4) print(min(x[ , j, k]))}

## [,1] [,2] [,3] [,4] ## [1,] 1 7 13 19 ## [2,] 3 9 15 21 ## [3,] 5 11 17 23

This can get confusing, but can be very powerful. Basically, I'm calculating the min for each element of the second dimension in the second-to-last example and for each pair of elements in the first and third dimensions in the final example. Caution: if the result of each subcalculation is a vector, these will be cbind()'ed together (recall column-major order), so if you used apply() on the row margin, you'll need to transpose the result.

Why use apply(), lapply(), etc.? The various apply() functions (apply, lapply, sapply, tapply, etc.) may be faster than a loop, but if the dominant part of the calculation lies in the time required by the function on each of the elements, then the main reason for using an apply() variant is code clarity.

n <- 5e+05 nr <- 1000 nCalcs <- n/nr mat <- matrix(rnorm(n), nr = nr) times <- 1:nCalcs system.time(out1 <- apply(mat, 1, function(vec) { mod <- lm(vec ~ times) return(mod$coef[2]) }))

## user system elapsed ## 3.44 17.78 3.36

system.time({ out2 <- rep(NA, nCalcs) for (i in 1:nCalcs) { out2[i] <- lm(mat[i, ] ~ times)$coef[2] } })

## user system elapsed ## 1.80 8.25 1.46

4.3 Linear algebra

We'll focus on matrices here. A few helpful functions are nrow() and ncol(), which tell the dimensions of the matrix. The row() and col() functions will return matrices of the same size as the original, but filled with the row or column number of each element. So to get the upper triangle of a matrix, X, we can do:

X <- matrix(rnorm(9), 3) X

## [,1] [,2] [,3] ## [1,] -1.525 -0.514 -0.225 ## [2,] 1.600 -0.953 1.804 ## [3,] -0.685 -0.338 0.354

X[col(X) >= row(X)]

## [1] -1.525 -0.514 -0.953 -0.225 1.804 0.354

See also the upper.tri() and lower.tri() functions, as well as the diag() function. diag() is quite handy - you can extract the diagonals, assign into the diagonals, or create a diagonal matrix:

diag(X)

## [1] -1.525 -0.953 0.354

diag(X) <- 1 X

## [,1] [,2] [,3] ## [1,] 1.000 -0.514 -0.225 ## [2,] 1.600 1.000 1.804 ## [3,] -0.685 -0.338 1.000 d <- diag(c(rep(1, 2), rep(2, 2))) d

## [,1] [,2] [,3] [,4] ## [1,] 1 0 0 0 ## [2,] 0 1 0 0 ## [3,] 0 0 2 0 ## [4,] 0 0 0 2

To transpose a matrix, use t(), e.g., t(X).

Basic operations The basic matrix-vector operations are:

X %*% Y # matrix multiplication
X * Y # direct product
x %o% y # outer product of vectors x, y: x times t(y)
outer(x, y) # same thing
outer(x, y, function(x, y) cos(y)/(1 + x^2)) # evaluation of f(x,y) for all pairs of x, y values
crossprod(X, Y) # same as, but faster than, t(X) %*% Y!

For inverses (X^-1) and solutions of systems of linear equations (X^-1 y):

solve(X) # inverse of X solve(X, y) # (inverse of X) %*% y

Note that if all you need to do is solve the linear system, you should never explicitly find the inverse, UNLESS you need the actual matrix, e.g., to get a covariance matrix of parameters. Otherwise, to find many solutions, all with the same matrix, X, you can use solve() with the second argument being a matrix with each column a different 'y' vector for which you want the solution. solve() is an example of using a matrix decomposition to solve a system of equations (in particular the LU decomposition). We'll defer matrix decompositions (LU, Cholesky, eigendecomposition, SVD, QR) until the numerical linear algebra unit.
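As a small sketch of solving for several right-hand sides at once (the matrices here are arbitrary):

X <- crossprod(matrix(rnorm(9), 3)) # a (generically) nonsingular 3x3 matrix
Y <- matrix(rnorm(6), nrow = 3) # each column is a separate 'y'
sols <- solve(X, Y) # column j of sols solves X %*% b = Y[, j]
dim(sols) # 3 2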

5 Functions

Functions are at the heart of R. In general, you should try to have functions be self-contained - operating only on arguments provided to them, and producing no side effects, though in some cases there are good reasons for making an exception. Functions that are not implemented internally in R (i.e., user-defined functions) are also referred to officially as closures (this is their type) - this terminology sometimes comes up in error messages.

5.1 Inputs

Arguments can be specified in the correct order, or given out of order by specifying name = value. In general the more important arguments are specified first. You can see the arguments and defaults

for a function using args():

args(lm)

## function (formula, data, subset, weights, na.action, method = "qr", ## model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE, ## contrasts = NULL, offset, ...) ## NULL

Functions may have unspecified arguments, which are designated using '...'. Unspecified arguments occurring at the beginning of the argument list are generally a collection of like objects that will be manipulated (consider paste(), c(), and rbind()), while unspecified arguments occurring at the end are often optional arguments (consider plot()). These optional arguments are sometimes passed along to a function within the function. For example, here's my own wrapper for plotting, where any additional arguments specified by the user will get passed along to plot:

pplot <- function(x, y, pch = 16, cex = 0.4, ...) { plot(x, y, pch = pch, cex = cex, ...) }

If you want to manipulate what the user passed in as the ... args, rather than just passing them along, you can extract them (the following code would be used within a function to which '...' is an argument):

myFun <- function(...) { print(..2) args <- list(...) print(args[[2]]) } myFun(1, 3, 5, 7)

## [1] 3 ## [1] 3

You can check if an argument is missing with missing(). Arguments can also have default values, which may be NULL. If you are writing a function and designate the default as argname = NULL, you can check whether the user provided anything using is.null(argname). The default values can also relate to other arguments. As an example, consider dgamma():

args(dgamma)

## function (x, shape, rate = 1, scale = 1/rate, log = FALSE) ## NULL
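Returning to missing() and NULL defaults, here is a minimal sketch (the function and argument names are made up for illustration):

wtdMean <- function(x, weights = NULL) {
    if (missing(x)) stop("'x' must be provided")
    if (is.null(weights)) weights <- rep(1, length(x)) # default if the user supplied nothing
    sum(x * weights)/sum(weights)
}
wtdMean(1:4) # 2.5, using equal weights
wtdMean(1:4, weights = 4:1) # 2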

Functions can be passed in as arguments (e.g., see the variants of apply()). Note that one does not need to pass in a named function - you can create the function on the spot - this is called an anonymous function:

mat <- matrix(1:9, 3) apply(mat, 2, function(vec) vec - vec[1])

##      [,1] [,2] [,3]
## [1,]    0    0    0
## [2,]    1    1    1
## [3,]    2    2    2

apply(mat, 1, function(vec) vec - vec[1]) # explain why the result is transposed

##      [,1] [,2] [,3]
## [1,]    0    0    0
## [2,]    3    3    3
## [3,]    6    6    6

We can see the arguments using args() and extract the arguments using formals(). formals() can be helpful if you need to manipulate the arguments.

f <- function(x, y = 2, z = 3/y) {
    x + y + z
}
args(f)

## function (x, y = 2, z = 3/y)
## NULL

formals(f)

## $x
## 
## 
## $y
## [1] 2
## 
## $z
## 3/y

class(formals(f))

## [1] "pairlist"

match.call() will show the user-supplied arguments explicitly matched to named arguments.

match.call(mean, quote(mean(y, na.rm = TRUE))) # what do you think quote does?

## mean(x = y, na.rm = TRUE)

Pass by value vs. pass by reference Note that R makes a copy of all objects that are arguments to a function, with the copy residing in the frame (the environment) of the function (we'll see more about frames just below). This is a case of pass by value. In other languages it is also possible to pass by reference, in which case changes to the object made within the function affect the value of the argument in the calling environment. R's designers chose not to allow pass by reference because they didn't like the idea that a function could have the side effect of changing an object. However, passing by reference can sometimes be very helpful, and we'll see ways of passing by reference in the R programming unit. An important exception is par(). If you change graphics parameters by calling par() in a user-defined function, they are changed permanently. One trick is as follows:

f <- function() {
    oldpar <- par()
    par(cex = 2)
    # body of code
    par(oldpar)
}

Note that changing graphics parameters within a specific plotting function - e.g., plot(x, y, pch = '+') - doesn't change things except for that particular plot.

5.2 Outputs

return(x) will specify x as the output of the function. By default, if return() is not specified, the output is the result of the last evaluated statement. return() can occur anywhere in the function, and allows the function to exit as soon as it is done. invisible(x) will return x and the result can be assigned in the calling environment but it will not be printed if not assigned:

f <- function(x) {
    invisible(x^2)
}
f(3)
a <- f(3)
a

## [1] 9

A function can only return a single object (unlike Matlab, e.g.), but of course we can tack things together as a list and return that, as with lm() and many other functions.

mod <- lm(mpg ~ cyl, data = mtcars)
class(mod)

## [1] "lm"

is.list(mod)

## [1] TRUE
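Here is a minimal sketch of the same pattern in a user-defined function (the function name is made up for illustration):

summStats <- function(x) {
    # bundle several results into a single named list
    list(mean = mean(x), sd = sd(x), n = length(x))
}
out <- summStats(rnorm(100))
out$mean
names(out)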

5.3 Variable scope

To consider variable scope, we need to define the terms environment and frame. Environments and frames are closely related. An environment is a collection of named objects (a frame), with a pointer to the ’enclosing environment’, i.e., the next environment to look for something in, also called the parent. Variables in the parent environment (the environment in which a function is defined, also called the enclosing environment) are available within a function. This is the analog of global variables in other languages. Note that the parent/enclosing environment is NOT the environment from which the function was called.

Be careful when using variables from the parent environment, as the value of that variable in the parent environment may well not be what you expect it to be. In general it's bad practice to use variables taken from environments outside that of a function, but in some cases it can be useful.

x <- 3
f <- function() {
    x <- x^2
    print(x)
}
f()
x  # what do you expect?
f <- function() {
    assign("x", x^2, env = .GlobalEnv)
}
# careful, this could be dangerous as a variable is changed as a side effect

Here are some examples to illustrate scope:

x <- 3
f <- function() {
    f2 <- function() {
        print(x)
    }
    f2()
}
f()  # what will happen?

f <- function() {
    f2 <- function() {
        print(x)
    }
    x <- 7
    f2()
}
f()  # what will happen?

f2 <- function() print(x)
f <- function() {
    x <- 7
    f2()
}
f()  # what will happen?

Here’s a somewhat tricky example:

y <- 100
f <- function() {
    y <- 10
    g <- function(x) x + y
    return(g)
}
h <- f()
h(3)

## [1] 13

Let’s work through this:

1. What is the enclosing environment of the function g()?

2. What does g() use for y?

3. When f() finishes, does its environment disappear? What would happen if it did?

4. What is the enclosing environment of h()?

Where are arguments evaluated? User-supplied arguments are evaluated in the calling frame, while default arguments are evaluated in the frame of the function:

z <- 3
x <- 100
f <- function(x, y = x * 3) {
    x + y
}
f(z * 5)

## [1] 60

Here, when f() is called, z is evaluated in the calling frame and assigned to x in the frame of the function, while y = x*3 is evaluated in the frame of the function.

6 Environments and frames

6.1 Frames and the call stack

R keeps track of the call stack, which is the set of nested calls to functions. The stack operates like a stack of cafeteria trays - when a function is called, it is added to the stack (pushed) and when it finishes, it is removed (popped). There are a bunch of functions that let us query what frames are on the stack and access objects in particular frames of interest. This gives us the ability to work with objects in the environment(s) from which a function was called. sys.nframe() returns the number of the current frame and sys.parent() the number of the parent. Careful: here, parent refers to the parent in terms of the call stack, not the enclosing environment. I won’t print the results here because knitr messes up the frame counting somehow.

sys.nframe()
f <- function() cat("Frame number is ", sys.nframe(), "; parent is ",
    sys.parent(), ".\n", sep = "")
f()
f2 <- function() f()
f2()

Now let’s look at some code that gets more information about the call stack and the frames involved using sys.status(), sys.calls(), sys.parents() and sys.frames().

# exploring functions that give us information about the frames in the stack
g <- function(y) {
    gg <- function() {
        # this gives us the information from sys.calls(),
        # sys.parents() and sys.frames() as one object
        print(sys.status())
    }
    if (y > 0) g(y - 1) else gg()
}
g(3)

If you’re interested in parsing a somewhat complicated example of frames in action, Adler provides a user-defined timing function that evaluates statements in the calling frame.

6.2 Environments and the search path

When R goes looking for an object (in the form of a symbol), it starts in the current environment (e.g., the frame/environment of a function) and then runs up through the enclosing environments, until it reaches the global environment, which is where R starts when you start it (it actually continues further up; see below). In general, these are not the frames on the stack. By default objects are created in the global environment, .GlobalEnv. As we've seen, the environment within a function call has as its enclosing environment the environment where the function was defined (not the environment from which it was called), and this is the next place that is searched if an object can't be found in the frame of the function call. This is called lexical scoping (and differs from S-plus). As an example, if an object couldn't be found within the environment of an lm() function call, R would first look in the environment (also called the namespace) of the stats package, then in packages imported by the stats package, then the base package, and then the global environment. If R can't find the object when reaching the global environment, it runs through the search path, which you can see with search(). The search path is a set of additional environments. Generally packages are created with namespaces, i.e., each has its own environment, as we see based on search(). Data frames or lists that you attach using attach() generally are placed just after the global environment.

search()

##  [1] ".GlobalEnv"        "package:knitr"
##  [3] "package:stats"     "package:graphics"
##  [5] "package:grDevices" "package:utils"
##  [7] "package:datasets"  "package:fields"
##  [9] "package:spam"      "package:methods"
## [11] "package:SCF"       "Autoloads"
## [13] "package:base"

searchpaths()

##  [1] ".GlobalEnv"
##  [2] "/server/linux/lib/R/2.15/x86_64/site-library/knitr"
##  [3] "/usr/lib/R/library/stats"
##  [4] "/usr/lib/R/library/graphics"
##  [5] "/usr/lib/R/library/grDevices"
##  [6] "/usr/lib/R/library/utils"
##  [7] "/usr/lib/R/library/datasets"
##  [8] "/server/linux/lib/R/2.15/x86_64/site-library/fields"
##  [9] "/server/linux/lib/R/2.15/x86_64/site-library/spam"
## [10] "/usr/lib/R/library/methods"
## [11] "/server/linux/lib/R/2.15/x86_64/site-library/SCF"
## [12] "Autoloads"
## [13] "/usr/lib/R/library/base"

We can also see the nestedness of environments using the following code, using environmentName(), which prints out a nice-looking version of the environment name.

x <- .GlobalEnv
parent.env(x)

## 
## attr(,"name")
## [1] "package:knitr"
## attr(,"path")
## [1] "/server/linux/lib/R/2.15/x86_64/site-library/knitr"

while (environmentName(x) != environmentName(emptyenv())) {
    print(environmentName(parent.env(x)))
    x <- parent.env(x)
}

## [1] "package:knitr" ## [1] "package:stats" ## [1] "package:graphics" ## [1] "package:grDevices" ## [1] "package:utils" ## [1] "package:datasets" ## [1] "package:fields" ## [1] "package:spam" ## [1] "package:methods" ## [1] "package:SCF" ## [1] "Autoloads" ## [1] "base" ## [1] "R_EmptyEnv"

Note that eventually the global environment and the environments of the packages are nested within the base environment (of the base package) and the empty environment. Note that here parent is referring to the enclosing environment. We can look at the objects of an environment as follows:

ls(pos = 7)[1:5]  # what does this do?

## [1] "ability.cov" "airmiles" "AirPassengers" ## [4] "airquality" "anscombe" ls("package:graphics")[1:5]

## [1] "abline" "arrows" "assocplot" "axis" ## [5] "Axis" environment(lm)

## <environment: namespace:stats>

We can retrieve and assign objects in a particular environment as follows:

lm <- function() {
    return(NULL)
}  # this seems dangerous but isn't
x <- 1:3
y <- rnorm(3)
mod <- lm(y ~ x)

## Error: unused argument(s) (y ~ x)

mod <- get("lm", pos = "package:stats")(y ~ x)
mod <- stats::lm(y ~ x)  # an alternative

Note that our (bogus) lm() function masks but does not overwrite the default function. If we remove ours, then the default one is still there.

6.3 with() and within()

with() provides a clean way to use a function (or any R code, specified as R statements enclosed within {}, unless you are evaluating a single expression as in the demo here) within the context of a data frame (or an environment). within() is similar, evaluating within the context of a data frame or a list, but it allows you to modify the data frame (or list) and returns the result.

with(mtcars, cyl * mpg)

##  [1] 126.0 126.0  91.2 128.4 149.6 108.6 114.4  97.6
##  [9]  91.2 115.2 106.8 131.2 138.4 121.6  83.2  83.2
## [17] 117.6 129.6 121.6 135.6  86.0 124.0 121.6 106.4
## [25] 153.6 109.2 104.0 121.6 126.4 118.2 120.0  85.6

new.mtcars <- within(mtcars, crazy <- cyl * mpg)
names(new.mtcars)

## [1] "mpg" "cyl" "disp" "hp" "drat" "wt" ## [7] "qsec" "vs" "am" "gear" "carb" "crazy"

7 Flow control and logical operations

7.1 Logical operators

Everyone is probably familiar with the comparison operators, <, <=, >, >=, ==, !=. Logical operators are slightly trickier:

Logical operators for subsetting & and | are the "AND" and "OR" operators used when subsetting - they act in a vectorized way:

x <- rnorm(10)
x[x > 1 | x < -1]

## [1] -1.35 -2.07 -1.34 -1.45

x <- 1:10
y <- c(rep(10, 9), NA)
x > 5 | y > 5  # note that TRUE | NA evaluates to TRUE

## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

Logical operators and if statements && and || use only the first element of a vector and also proceed from left to right, returning the result as soon as possible and then ignoring the remaining comparisons (this can be handy because in some cases the second condition may give an error if the first condition is not passed). They are used in flow control (i.e., with if statements). Let's consider how the single and double operators differ:

a <- 7
b <- NULL
a < 8 | b > 3

## logical(0)

a < 8 || b > 3

## [1] TRUE

a <- c(0, 3)
b <- c(4, 2)
if (a < 7 & b < 7) print("this is buggy code")

## Warning: the condition has length > 1 and only the first element will be used

## [1] "this is buggy code"

if (a < 7 && b < 7) print("this is buggy code too, but runs w/o warnings")

## [1] "this is buggy code too, but runs w/o warnings"

if (a[1] < 7 && b[1] < 7) print("this code is correct and the condition is TRUE")

## [1] "this code is correct and the condition is TRUE"

You can use ! to indicate negation:

a <- 7
b <- 5
!(a < 8 && b < 6)

## [1] FALSE

7.2 If statements

If statements are at the core of programming. In R, the syntax is if(condition) statement else other_statement, e.g.,

x <- 5
if (x > 7) {
    x <- x + 3
} else {
    x <- x - 3
}

When one of the statements is a single statement, you don’t need the curly braces around that statement.

if (x > 7) x <- x + 3 else x <- x - 3

An extension of if looks like:

x <- -3
if (x > 7) {
    x <- x + 3
    print(x)
} else if (x > 4) {
    x <- x + 1
    print(x)
} else if (x > 0) {
    x <- x - 3
    print(x)
} else {
    x <- x - 7
    print(x)
}

## [1] -10

Finally, be careful that else should not start its own line, unless it is preceded by a closing brace on the same line. Why?

if (x > 7) {
    statement1
}   # what happens at this point?
else {   # what happens now?
    statement2
}

There's also the ifelse() function, which operates in a vectorized fashion:

x <- rnorm(6)
truncx <- ifelse(x > 0, x, 0)
truncx

## [1] 0.769 1.698 0.000 0.000 0.590 0.000

Common bugs in the condition of an if statement include the following:

1. Only the first element of condition is evaluated. You should be careful that condition is a single logical value and does not evaluate to a vector, as this would generally be a bug. [see p. 152 of Chambers]

2. Use identical() or all.equal() rather than "==" to ensure that you deal properly with vectors and always get a single logical value back (see the short example after this list). We'll talk more about issues that can arise when comparing decimal numbers on a computer later in the course.

3. If condition includes some R code, it can fail and produce something that is neither TRUE nor FALSE. Defensive programming practice is to check the condition for validity.

vals <- c(1, 2, NA)
eps <- 1e-09
# now pretend vals comes from some other chunk of code that we don't control
if (min(vals) > eps) {
    print(vals)
}  # not good practice

## Error: missing value where TRUE/FALSE needed

minval <- min(vals)
if (!is.na(minval) && minval > eps) {
    print(vals)
}  # better practice
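As a short illustration of the point above about identical() and all.equal() versus "==":

x <- c(1, 2, 3)
y <- c(1, 2, 3)
x == y                   # a vector of logicals - risky inside if()
identical(x, y)          # a single TRUE/FALSE
1 == 1 + 1e-12           # FALSE: exact comparison of decimal numbers
all.equal(1, 1 + 1e-12)  # TRUE: equality up to a numerical tolerance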

7.3 switch()

switch() is handy for choosing amongst multiple outcomes depending on an input, avoiding a long set of if-else syntax. The first argument is a statement that determines what choice is made and the second is a list of the outcomes, in order or by name:

x <- 2; y <- 10
switch(x, log(y), sqrt(y), y)

## [1] 3.16

center <- function(x, type) {
    switch(type,
        mean = mean(x),  # make sure to use = and not <-
        median = median(x),
        trimmed = mean(x, trim = .1))
}
x <- rgamma(100, 1)
center(x, 'median')

## [1] 0.625

center(x, 'mean')

## [1] 0.771

7.4 Loops

Loops are at the core of programming in other functional languages, but in R, we often try to avoid them as they're often (but not always) slow. In many cases looping can be avoided by using vectorized calculations, versions of apply(), and other tricks. But sometimes they're unavoidable, and for quick and dirty coding and small problems, they're fine. And in some cases they may be faster than other alternatives. One case we've already seen is that in working with lists they may be faster than their lapply-style counterpart, though often they will not be. The workhorse loop is the for loop, which has the syntax: for(var in sequence) statement, where, as with the if statement, we need curly braces around the body of the loop if it contains more than one valid R statement:

nIts <- 500
means <- rep(NA, nIts)
for (it in 1:nIts) {
    means[it] <- mean(rnorm(100))
    if (identical(it%%100, 0)) cat("Iteration", it, date(), "\n")
}

## Iteration 100 Fri Oct 5 08:08:40 2012
## Iteration 200 Fri Oct 5 08:08:40 2012
## Iteration 300 Fri Oct 5 08:08:40 2012
## Iteration 400 Fri Oct 5 08:08:40 2012
## Iteration 500 Fri Oct 5 08:08:40 2012

Challenge: how do I do this much faster? You can also loop over a non-numeric vector of values.

for (state in c("Ohio", "Iowa", "Georgia")) print(state.x77[row.names(state.x77) state, "Income"])

## [1] 4561
## [1] 4628
## [1] 4091

Challenge: how can I do this faster? Note that to print to the screen in a loop you explicitly need to use print() or cat(); just writing the name of the object will not work. This is similar to if statements and functions.

for (i in 1:10) i

You can use the commands break (to end the looping) and next (to go to the next iteration) to control the flow:

for (i in 1:10) {
    if (i == 5) break
    print(i)
}

## [1] 1
## [1] 2
## [1] 3
## [1] 4

for (i in 1:5) {
    if (i == 2) next
    print(i)
}

## [1] 1
## [1] 3
## [1] 4
## [1] 5

while loops are used less frequently, but can be handy: while(condition) statement, e.g. in optimization. See p. 59 of Venables and Ripley, 4th ed., whose code I’ve included in the demo code file. A common cause of bugs in for loops is when the range ends at zero or a missing value:

mat <- matrix(1:4, 2)
submat <- mat[mat[1, ] > 5]
for (i in 1:nrow(submat)) print(i)

## Error: argument of length 0
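A minimal sketch of the usual fix is below; note that this variant keeps the matrix structure with drop = FALSE (unlike the code above) and uses seq_len(), which gives an empty sequence when the count is zero, so the loop body is simply skipped rather than erroring:

submat <- mat[mat[1, ] > 5, , drop = FALSE]  # a matrix with zero rows here
for (i in seq_len(nrow(submat))) print(i)    # loops zero times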

8 Formulas

Formulas were introduced into R to specify linear models, but are now used more generally. Here are some examples of formulas in R, used to specify a model structure:

Additive model:
y ~ x1 + x2 + x3
Additive model without the intercept:
y ~ x1 + x2 + x3 - 1
All the other variables in the data frame are used as covariates:
y ~ .
All possible interactions:
y ~ x1 * x2 * x3
Only specified interactions (in this case x1 by x2) (of course, you'd rarely want to fit this without x2):
y ~ x1 + x3 + x1:x2
Protecting arithmetic expressions:
y ~ x1 + I(x1^2) + I(x1^3)
Using functions of variables:
y ~ x1 + log(x2) + sin(x3)

In some contexts, such as lattice package graphics, the "|" indicates conditioning, so y ~ x | z would mean to plot y on x within groups of z. In the context of lme-related packages (e.g., lme4, nlme, etc.), variables after "|" are grouping variables (e.g., if you have a random effect for each hospital, hospital would be the grouping variable) and multiple grouping variables are separated by "/". We can manipulate formulae as objects, allowing automation. Consider how this sort of thing could be used to write code for automated model selection.

resp <- "y ~" covTerms <- "x1" for (i in 2:5) { covTerms <- paste(covTerms, "+ x", i, sep = "") } form <- as.formula(paste(resp, covTerms, sep = "")) # lm(form, data = dat) form

## y ~ x1 + x2 + x3 + x4 + x5

class(form)

## [1] "formula"

The for loop is a bit clunky/inefficient - let’s do better:

resp <- "y ~" covTerms <- paste("x", 1:5, sep = "", collapse = " + ") form <- as.formula(paste(resp, covTerms))

form

## y ~ x1 + x2 + x3 + x4 + x5

# lm(form, data = dat)

Standard arguments in model fitting functions, in addition to the formula, are weights, data (indicating the data frame in which to interpret the variable names), subset (for using a subset of data), and na.action. Note that the default na.action in R is set in options()$na.action and is na.omit, so be wary when fitting models: you may be dropping cases with NAs without being aware of it. There is some more specialized syntax given in R-intro.pdf on CRAN.
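For example, here is a minimal sketch (with a made-up data frame) of checking the default and forcing R to complain rather than silently dropping rows:

options()$na.action  # "na.omit" by default
dat <- data.frame(y = c(1, 2, NA), x = c(3, 4, 5))
lm(y ~ x, data = dat)  # silently drops the row with the NA
try(lm(y ~ x, data = dat, na.action = na.fail))  # errors, making the NAs visible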

9 Data storage and formats (outside R)

At this point, we're going to turn to reading data into R and manipulating text, including regular expressions. We'll focus on doing these manipulations in R, but the concepts involved in reading in data, database manipulations, and regular expressions are common to other languages, so familiarity with these in R should allow you to pick up other tools more easily. The main downside to working with datasets in R is that the entire dataset resides in memory, so R is not so good for dealing with very large datasets. More on alternatives in a bit. Another common frustration is controlling how the variables are interpreted (numeric, character, factor) when reading data into a data frame. R has the capability to read in a wide variety of file formats. Let's get a feel for some of the common ones.

1. Flat text files (ASCII files): data are often provided as simple text files. Often one has one record or observation per row and each column or field is a different variable or type of information about the record. Such files can either have a fixed number of characters in each field (fixed width format) or a special character (a delimiter) that separates the fields in each row. Common delimiters are tabs, commas, one or more spaces, and the pipe (|). Common file extensions are .txt and .csv. Metadata (information about the data) is often stored in a separate file. I like CSV files but if you have files where the data contain commas, other delimiters can be good. Text can be put in quotes in CSV files. This was difficult to deal with in the shell in PS1, but read.table() in R handles this situation. A couple more intricate details:

(a) Sometimes if you have a text file created in Windows, the line endings are coded differently than in UNIX (\r\n in Windows vs. \n in UNIX). There are UNIX utilities (fromdos in Ubuntu, including the SCF Linux machines, and dos2unix in other Linux distributions) that can do the necessary conversion. If you see ^M at the end of the lines in a file, that's the tool you need. Alternatively, if you open a UNIX file in Windows, it may treat all the lines as a single line. You can fix this with todos (or a similar utility).

(b) Another difficulty can be dealing with the text encoding (the mapping of individual characters (including tabs, returns, etc.) to a set of numeric codes) when the file is not simply an ASCII file (i.e., contains non-ASCII characters - stuff that's not on a US-style keyboard). There are a variety of different encodings for text files (ASCII is the most basic encoding, and allows for 2^7 different characters; others allow for 2^16), with different ones common on different operating systems. The UNIX utility file, e.g., file tmp.txt, can help provide some information. read.table() in R takes arguments fileEncoding and encoding that address this issue. The R function iconv() (and the UNIX utility of the same name) can help with conversions.

text <- "_Melhore sua seguran\xe7a_" iconv(text, from = "latin1", to = "UTF-8") iconv(text, from = "latin1", to = "ASCII", sub = "???")

2. In some contexts, such as textual data, the data may be in a text file with one piece of information per row, but without meaningful columns/fields.

3. In scientific contexts, netCDF (.nc) (and the related HDF5) are popular formats for gridded data that allow for highly-efficient storage and contain the metadata within the file. The basic structure of a netCDF file is that each variable is an array with multiple dimensions (e.g., latitude, longitude, and time), and one can also extract the values of and metadata about each dimension. The ncdf package in R nicely handles working with netCDF files. These are examples of a binary format, which is not (easily) human readable but can be more space-efficient and faster to work with (because they can allow random access into the data rather than requiring sequential reading).

4. Data may also be in the form of XML or HTML files. We won't deal with these in 243, except to the extent that they come up in a problem set.

5. Data may already be in a database or in the data storage of another statistical package (Stata, SAS, SPSS, etc.). The foreign package in R has excellent capabilities for importing Stata (read.dta()), SPSS (read.spss()), SAS (read.ssd() and, for XPORT files, read.xport()), and dbf (a common database format) (read.dbf()), among others.

6. For Excel, there are capabilities to read an Excel file (see the XLConnect package among others), but you can also just go into Excel and export as a CSV file or the like and then read that into R. In general, it's best not to pass around data files as Excel or other spreadsheet format files because (1) Excel is proprietary, so someone may not have Excel and the format is subject to change, (2) Excel imposes limits on the number of rows, (3) one can easily manipulate text files such as CSV using UNIX tools, but this is not possible with an Excel file, (4) Excel files often have more than one sheet, graphs, macros, etc., so they're not a data storage format per se.

10 Reading data from text files into R

We can use read.fwf() to read from a fixed width text file into a data frame. read.table() is probably the most commonly used function for reading in data; it reads in delimited files (read.csv() and read.delim() are special cases of read.table()). The key arguments are the delimiter (the sep argument) and whether the file contains a header, a line with the variable names. The most difficult part of reading in such files can be dealing with how R determines the classes of the fields that are read in. There are a number of arguments to read.table() and read.fwf() that allow the user to control the classes. One difficulty is that character and numeric fields are sometimes read in as factors. Basically read.table() tries to read fields in as numeric and if it finds non-numeric and non-NA values, it reads in as a factor. This can be annoying. Let's work through a couple examples. First let's look at the arguments to read.table(). Note that sep = "" separates on any amount of white space.

setwd("~/Desktop/243/data") dat <- read.table("RTAData.csv", sep = ",", head = TRUE) lapply(dat, class) levels(dat[, 2]) dat2 <- read.table("RTAData.csv", sep = ",", head = TRUE, na.strings = c("NA", "x"), stringsAsFactors = FALSE) unique(dat2[, 2]) # hmmm, what happened to the blank values this time? which(dat[, 2] == "") dat2[which(dat[, 2] == "")[1], ] # deconstruct it! sequ <- read.table("hivSequ.csv", sep = ",", header = TRUE, colClasses = c("integer", "integer", "character", "character", "numeric", "integer"))

# let's make sure the coercion worked - sometimes R is obstinate
lapply(sequ, class)  # this makes use of the fact that a data frame is a list

The basic function scan() simply reads everything in, ignoring lines, which works well and very quickly if you are reading in a numeric vector or matrix. scan() is also useful if your file is free format - i.e., if it’s not one line per observation, but just all the data one value after another; in this case you can use scan() to read it in and then format the resulting character or numeric vector as a matrix with as many columns as fields in the dataset. Remember that the default is to fill the matrix by column. If the file is not nicely arranged by field (e.g., if it has ragged lines), we’ll need to do some more work. readLines() will read in each line into a separate character vector, after which we can process the lines using text manipulation.

dat <- readLines("~/Desktop/243/data/precip.txt") id <- as.factor(substring(dat, 4, 11)) year <- substring(dat, 17, 20) class(year)

## [1] "character"

year <- as.integer(substring(dat, 18, 21))
month <- as.integer(substring(dat, 22, 23))
nvalues <- as.integer(substring(dat, 28, 30))
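Returning to scan() for free-format files, here is a minimal sketch; the file name and the number of columns are made up for illustration:

# assume 'vals.txt' holds whitespace-delimited numbers, one value after another
vals <- scan("vals.txt")
mat <- matrix(vals, ncol = 4, byrow = TRUE)  # byrow = TRUE if values are stored row by row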

R allows you to read in not just from a file but from a more general construct called a connection. Here are some examples of connections:

dat <- readLines(pipe("ls -al"))
dat <- read.table(pipe("unzip dat.zip"))
dat <- read.csv(gzfile("dat.csv.gz"))
dat <- readLines("http://www.stat.berkeley.edu/~paciorek/index.html")

If a file is large, we may want to read it in in chunks (of lines), do some computations to reduce the size of things, and iterate. read.table(), read.fwf() and readLines() all have arguments that let you read in a fixed number of lines. To read on the fly in blocks, we need to first establish the connection and then read from it sequentially.

con <- file("~/Desktop/243/data/precip.txt", "r")
class(con)
blockSize <- 1000  # obviously this would be large in any real application
nLines <- 3e+05
for (i in 1:ceiling(nLines/blockSize)) {
    lines <- readLines(con, n = blockSize)
    # manipulate the lines and store the key stuff
}
close(con)

One cool trick that can come in handy is to create a text connection. This lets you ’read’ from an R character vector as if it were a text file and could be handy for processing text. For example, you could then use read.fwf() applied to con.

dat <- readLines("~/Desktop/243/data/precip.txt") con <- textConnection(dat[1], "r") read.fwf(con, c(3, 8, 4, 2, 4, 2))

##    V1      V2   V3 V4   V5 V6
## 1 DLY 1000807 PRCP HI 2010  2

We can create connections for writing output too. Just make sure to open the connection first. Be careful with the directory separator in Windows files: you can either do "C:\\mydir\\file.txt" or "C:/mydir/file.txt", but not "C:\mydir\file.txt". [I think; I haven't checked this, so a Windows user should correct me if I'm wrong.]
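Here is a minimal sketch of writing through a connection (the file name is made up for illustration):

con <- file("results.txt", "w")  # open the connection for writing
writeLines(c("first line", "second line"), con)
cat("a third line\n", file = con)
close(con)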

11 Text manipulations and regular expressions

Text manipulations in R have a number of things in common with Perl, Python and UNIX, as many of these evolved from UNIX. When I use the term string here, I'll be referring to any sequence of characters that may include numbers, white space, and special characters, rather than to the character class of R objects. The string or strings will generally be stored as R character vectors.

11.1 Basic text manipulation

A few of the basic R functions for manipulating strings are paste(), strsplit(), and substring(). paste() and strsplit() are basically inverses of each other: paste() concatenates together an arbitrary

set of strings (or a vector, if using the collapse argument) with a user-specified separator character, while strsplit() splits apart based on a delimiter/separator. substring() splits apart the elements of a character vector based on fixed widths. Note that all of these operate in a vectorized fashion.

out <- paste("My", "name", "is", "Chris", ".", sep = "") paste(c("My", "name", "is", "Chris", "."), collapse = "")

## [1] "My name is Chris ."

strsplit(out, split = "")

## [[1]] ## [1] "My" "name" "is" "Chris" "."

nchar() tells the number of characters in a string. To identify particular subsequences in strings, there are several related R functions. grep() will look for a specified string within an R character vector and report back indices identifying the elements of the vector in which the string was found (using the fixed=TRUE argument ensures that regular expressions are NOT used). gregexpr() will indicate the position in each string at which the specified string is found (use regexpr() if you only want the first occurrence). gsub() can be used to replace a specified string with a replacement string (use sub() if you want to replace only the first occurrence).

vars <- c("P", "HCA24", "SOH02") substring(vars, 2, 3)

## [1] "" "CA" "OH"

vars <- c("date98", "size98", "x98weights98", "sdfsd") grep("98", vars)

## [1] 1 2 3

gregexpr("98", vars)

## [[1]]
## [1] 5
## attr(,"match.length")
## [1] 2
## attr(,"useBytes")
## [1] TRUE
## 
## [[2]]
## [1] 5
## attr(,"match.length")
## [1] 2
## attr(,"useBytes")
## [1] TRUE
## 
## [[3]]
## [1] 2 11
## attr(,"match.length")
## [1] 2 2
## attr(,"useBytes")
## [1] TRUE
## 
## [[4]]
## [1] -1
## attr(,"match.length")
## [1] -1
## attr(,"useBytes")
## [1] TRUE

gsub("98", "04", vars)

## [1] "date04" "size04" "x04weights04" ## [4] "sdfsd"

11.2 Regular expressions (regexp)

Overview and core syntax The grep(), gregexpr() and gsub() functions are more powerful when used with regular expressions. Regular expressions are a domain-specific language for finding patterns and are one of the key functionalities in scripting languages such as Perl and Python. Duncan Temple Lang (UC Davis Stats) has written a nice tutorial that I've posted on bspace (regexprLang.pdf) or check out Sections 9.9 and 11 of Murrell. We'll just cover their use in R, but once

you know that, it would be easy to use them elsewhere. What I describe here is the "extended regular expression" syntax (POSIX 1003.2), but with the argument perl=TRUE, you can get Perl-style regular expressions. At the level we'll consider them, the syntax is quite similar. The basic idea of regular expressions is that they allow us to find matches of strings or patterns in strings, as well as do substitution. Regular expressions are good for tasks such as:

• extracting pieces of text - for example finding all the links in an html document;

• creating variables from information found in text;

• cleaning and transforming text into a uniform format;

• mining text by treating documents as data; and

• scraping the web for data.

Regular expressions are constructed from three things: Literal characters are matched only by the characters themselves, Character classes are matched by any single member in the class, and Modifiers operate on either of the above or combinations of them. Note that the syntax is very concise, so it's helpful to break down individual regular expressions into the component parts to understand them. As Murrell notes, since regexp are their own language, it's a good idea to build up a regexp in pieces as a way of avoiding errors, just as we would with any computer code. gregexpr() is particularly useful in seeing what was matched, to help in understanding and learning regular expression syntax and debugging your regexp. The special characters (meta-characters) used for defining regular expressions are: * . ^ $ + ? ( ) [ ] { } | \ . To use these characters literally as characters, we have to 'escape' them. In R, we have to use two backslashes instead of a single backslash because R uses a single backslash to symbolize certain control characters, such as \n for newline. Outside of R, one would only need a single backslash.

Character sets and character classes If we want to search for any one of a set of characters, we use a character set, such as [13579] or [abcd] or [0-9] (where the dash indicates a sequence) or [0-9a-z] or [ \t]. To indicate any character not in a set, we place a ^ just inside the first bracket: [^abcd]. The period stands for any character. There are a bunch of named character classes so that we don't have to write out common sets of characters. The syntax is [:class:] where class is the name of the class. The classes include digit, alpha, alnum, lower, upper, punct, blank, space (see ?regexp in R for formal definitions of all of these, but most are fairly self-explanatory). To make a character set with a character

class, e.g., the digit class: [[:digit:]]. Or we can make a combined character set such as [[:alnum:]_]. E.g., the latter would be useful in looking for email addresses.

addresses <- c("[email protected]", "[email protected]", "[email protected]")
grep("[[:digit:]_]", addresses)

## [1] 2 3

Some synonyms for the various classes are: \\w = [:alnum:], \\W = ^[:alnum:], \\d = [:digit:], \\D = ^[:digit:], \\s = [:space:], \\S = ^[:space:]. Challenge: how would we find a spam-like pattern with digits or non-letters inside a word? E.g., I want to find "V1agra" or "Fancy repl!c@ted watches".

Location-specific matches To find a pattern at the beginning of the string, we use ^ (note this was also used for negation, but in that case occurs only inside square brackets) and to find it at the end we use $.

text <- c("john", "jennifer pierce", "Juan carlos rey") grep("^[[:upper:]]", text) # finds text that starts with an upper case letter grep("[[:digit:]]$", text) # finds text with a number at the end

What does this match: ^[^[:lower:]]$ ? Here are some more examples, illustrating the use of regexp in grep(), gregexpr(), and gsub().

text <- c("john", "jennifer pierce", "Juan carlos rey") grep("[ \t]", text)

## [1] 2 3

gregexpr("[ \t]", text)

## [[1]]
## [1] -1
## attr(,"match.length")
## [1] -1
## attr(,"useBytes")
## [1] TRUE
## 
## [[2]]
## [1] 9
## attr(,"match.length")
## [1] 1
## attr(,"useBytes")
## [1] TRUE
## 
## [[3]]
## [1] 5 12
## attr(,"match.length")
## [1] 1 1
## attr(,"useBytes")
## [1] TRUE

gsub("^j", "J", text)

## [1] "John" "Jennifer pierce" ## [3] "Juan carlos rey"

Repetitions Now suppose I wanted to be able to detect “V1@gra” as well. I need to be able to deal with repetitions of character sets. I can indicate repetitions as indicated in these examples:

• [[:digit:]]* – any number of digits

• [[:digit:]]+ – at least one digit

• [[:digit:]]? – zero or one digits

• [[:digit:]]{1,3} – at least one and no more than three digits

• [[:digit:]]{2,} – two or more digits

An example is that \\[.*\\] is the pattern of any number of characters (.*) enclosed in square brackets. So the spam search might become:

56 text <- c("hi John", "V1@gra", "here's the problem set") grep("[[:alpha:]]+[[:digit:][:punct:]]+[[:alpha:]]*", text) # ok, we need to

## [1] 2 3

Grouping and references We often want to be able to look for multi-character patterns and to be able to refer back to the patterns that are found. Both are accomplished with parentheses. We can search for a group of characters by putting the group in parentheses, such as ([[:digit:]]{1,3}\\.) to find a 1 to 3 digit number followed by a period. Here's an example of searching for an IP number:

grep("([[:digit:]]{1,3}\\.){3}[[:digit:]]{1,3}", text)

You'll explore this further on PS2. It's often helpful to be able to save a pattern as a variable and refer back to it. We saw an example of this in the syntax I gave you in PS1. Let's translate that syntax here:

text <- ("\"H4NY07011\",\"ACKERMAN, GARY L.\",\"H\",\"$13,242\",,,")
gsub("([^\",]),", "\\1", text)

## [1] "\"H4NY07011\",\"ACKERMAN GARY L.\",\"H\",\"$13242\",,,"

We can have multiple sets of parentheses, using \\1, \\2, etc. We can indicate any one of a set of multi-character sequences as: (http|ftp).

gregexpr("(http|ftp):\\/\\/", c("at the site http://www.ibm.com", "other text"))

## [[1]]
## [1] 13
## attr(,"match.length")
## [1] 7
## attr(,"useBytes")
## [1] TRUE
## 
## [[2]]
## [1] -1
## attr(,"match.length")
## [1] -1
## attr(,"useBytes")
## [1] TRUE

Challenge: How would I extract an email address from an arbitrary text string? Comment: if you want to just return the address, using gsub(pattern, "\\1", text) will not do the trick. Why not? Challenge: Suppose a text string has dates in the form "Aug-3", "May-9", etc. and I want them in the form "3 Aug", "9 May", etc. How would I do this search/replace?

Greedy matching It turns out the pattern matching is 'greedy' - it looks for the longest match possible. Suppose we want to strip out html tags as follows:

text <- "Students may participate in an internship in place of one gsub("<.*>", "", text)

## [1] "Students may participate in an internship of their courses."

One additional bit of syntax is that one can append a ? to the repetition syntax to cause the matching to be non-greedy. Here’s an example.

gregexpr("[[:space:]]+.+?[[:space:]]+", "the dog jumped over the blue moon")

## [[1]]
## [1]  4 15 24
## attr(,"match.length")
## [1] 5 6 6
## attr(,"useBytes")
## [1] TRUE

Challenge: How could we change our regexp to avoid the greedy matching?

Regular expressions in other contexts Regular expressions can be used in a variety of places. E.g., to split by any number of white space characters:

Table 1. Regular expression syntax.

Syntax          What it matches
^ab             match 'ab' at the beginning of the string
ab$             match 'ab' at the end of the string
[abc]           match a or b or c anywhere (this is a character class)
[ \t]           match a space or a tab
(ab|cd|def)     match any of the strings in the set
(ab){2,9}       match 'ab' repeated at least 2 and no more than 9 times
(ab){2,}        match 'ab' repeated 2 or more times
[0-9a-z]        match a single digit or lower-case alphabetical
[^0-9]          match any single character except a digit
a.b             match a and b separated by a single character
a.*b            match a and b separated by any number of (or no) characters
a.+b            like a.*b but must have at least one character in between
[[:digit:]]     match the digit class; other classes are alpha, alnum, lower, upper, punct, blank, space (see ?regexp)
\\[             double backslashes are used if we want to search for a meta-character used in regexp syntax

line <- "a dog\tjumped\nover \tthe moon."
cat(line)

## a dog    jumped
## over     the moon.

strsplit(line, split = "[[:space:]]+")

## [[1]] ## [1] "a" "dog" "jumped" "over" "the" ## [6] "moon." strsplit(line, split = "[[:blank:]]+")

## [[1]] ## [1] "a" "dog" "jumped\nover" ## [4] "the" "moon."

Summary Table 1 summarizes the key syntax in regular expressions.

12 Manipulating dates

One common form of text information is dates and times. These can be a hassle because of the variety of ways of representing them, issues of time zones and leap years, the complicated mapping between dates and day of week, and the irregularity of the number of days in a month. We can use as.Date() to convert from text to an object of class Date, providing a character string such as “%Y-%m-%d” or “%d/%m/%Y” as the format for R to expect.

date1 <- as.Date("03-01-2011", format = "%m-%d-%Y")
date2 <- as.Date("03/01/11", format = "%m/%d/%y")
date3 <- as.Date("05-May-11", format = "%d-%b-%y")
date1

## [1] "2011-03-01"

date1 + 42

## [1] "2011-04-12"

date3 - date1

## Time difference of 65 days

With an object in the Date class, you can get information with functions such as weekdays(), months(), julian(). Note that the Julian date is the number of days since a fixed initial day (in R’s case this defaults to Jan 1, 1970). One handy feature is to use as.Date() to convert from Julian date to a more informative format:

julianValues <- c(1, 100, 1000)
as.Date(julianValues, origin = "1990-01-01")

## [1] "1990-01-02" "1990-04-11" "1992-09-27"

One can also use the POSIXlt and POSIXct classes to work with dates and times – see ?DateTimeClasses. The chron package provides classes for dates and times and for manipulating them. You can use as.chron() to convert from a Date object. Here’s some chron syntax:

library(chron)
d1 <- chron("12/25/2004", "11:37:59")  # default format of m/d/Y and h:m:s
d2 <- as.chron(date1)
d1

## [1] (12/25/04 11:37:59)

d2

## [1] 03/01/11

13 Database manipulations

13.1 Databases

A relational database stores data as a set of tables (or relations), which are rather similar to R data frames, in that they are made up of columns or fields of one type (numeric, character, date, currency, ...) and rows or records containing the observations for one entity. One principle of databases is that if a category is repeated in a given variable, you can more efficiently store information about each level of the category in a separate table; consider information about people living in a state and information about each state - you don't want to include variables that only vary by state in the table containing information about individuals (at least until you're doing the actual analysis that needs the information in a single table). You can interact with databases from a variety of database systems (DBMS = database management system; some systems are SQLite, MySQL, Oracle, Access). We'll concentrate on accessing data in a database rather than management of a database. The DBI package provides a front-end for manipulating databases from a variety of DBMS (MySQL, SQLite, Oracle, among others). Basically, you tell the package what DBMS is being used on the back end, connect to the actual database, and then you can use the syntax in the package. Here's an example of using DBI to interact with a database called myDatabase (containing the table MarsData) using the SQLite system:

drv <- dbDriver("SQLite") con <- dbConnect(drv, dbname = "myDatabase") # so we're using a connection mars <- dbReadTable(con, "MarsData") dbDisconnect(con)

SQL is a common database query language, with variants on it used in the various database systems. It has statements like:

SELECT Id, City, State FROM Station
WHERE Lat_N > 39.7 AND Lon_W > 80
ORDER BY Lon_W

You can use DBI to send a database query from R rather than reading in the whole table:

myQuery <- "SELECT City, State FROM Station WHERE Lat_N > 39.7" marsSub <- dbGetQuery(con, myQuery)

You can also use dbSendQuery() combined with fetch() to pull in a fixed number of records at a time, if you’re working with a big database. Finally you can do database joins (i.e., merges) using SQL, but we won’t go into the syntax here. SAS can also be used for database management and manipulation and is quite fast for working with large datasets, storing them on disk rather than in memory. Some statisticians (including myself) often do database pre-processing in SAS and then output to R for analysis.
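Here is a minimal sketch of pulling records in chunks with dbSendQuery() and fetch(), reusing the connection and query from above:

res <- dbSendQuery(con, myQuery)
while (TRUE) {
    chunk <- fetch(res, n = 1000)  # grab up to 1000 records at a time
    if (nrow(chunk) == 0) break
    # process 'chunk' here
}
dbClearResult(res)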

13.2 Dataset manipulations in R

Ok, let’s assume you’ve gotten your data into an R data frame. It might be numeric, character, or some combination. R now allows you to do a variety of database manipulations.

Getting information about variable(s) One can get a variety of information about a variable:

sum(mtcars$cyl == 6) # counts the number of TRUE values

## [1] 7

mean(mtcars$cyl == 6) # proportion of TRUE values

## [1] 0.219

unique(mtcars$cyl)

## [1] 6 4 8

length(unique(mtcars$cyl))

## [1] 3

duplicated(mtcars$cyl) # tells us which elements are repeated

##  [1] FALSE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE
##  [9]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [17]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [25]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE

table(mtcars$gear)  # tabulates # of records in each unique value of mtcars$gear

## 
##  3  4  5 
## 15 12  5

table(mtcars$cyl, mtcars$gear)  # cross-tabulation by two variables

##     
##       3  4  5
##   4   1  8  2
##   6   2  4  1
##   8  12  0  2

The output from table() can be hard to read, but ftable() can help. Compare:

table(mtcars[c("cyl", "vs", "am", "gear")])

## , , am = 0, gear = 3
## 
##    vs
## cyl  0  1
##   4  0  1
##   6  0  2
##   8 12  0
## 
## , , am = 1, gear = 3
## 
##    vs
## cyl  0  1
##   4  0  0
##   6  0  0
##   8  0  0
## 
## , , am = 0, gear = 4
## 
##    vs
## cyl  0  1
##   4  0  2
##   6  0  2
##   8  0  0
## 
## , , am = 1, gear = 4
## 
##    vs
## cyl  0  1
##   4  0  6
##   6  2  0
##   8  0  0
## 
## , , am = 0, gear = 5
## 
##    vs
## cyl  0  1
##   4  0  0
##   6  0  0
##   8  0  0
## 
## , , am = 1, gear = 5
## 
##    vs
## cyl  0  1
##   4  1  1
##   6  1  0
##   8  2  0

ftable(mtcars[c("cyl", "vs", "am", "gear")])

##            gear  3  4  5
## cyl vs am
## 4   0  0         0  0  0
##        1         0  0  1
##     1  0         1  2  0
##        1         0  6  1
## 6   0  0         0  0  0
##        1         0  2  1
##     1  0         2  2  0
##        1         0  0  0
## 8   0  0        12  0  0
##        1         0  0  2
##     1  0         0  0  0
##        1         0  0  0

Sorting Dataset sorting in R is a bit of a hassle, as the sort() function only sorts a single vector. Instead we rely on the order() function which gives the set of indices required to order a given vector (or multiple vectors, with ties within the first vector broken based on subsequent vectors) in either increasing or decreasing order.

mtcarsSorted <- mtcars[order(mtcars$cyl, mtcars$mpg), ]
head(mtcarsSorted)

##                mpg cyl disp  hp drat   wt qsec vs am
## Volvo 142E    21.4   4  121 109 4.11 2.78 18.6  1  1
## Toyota Corona 21.5   4  120  97 3.70 2.46 20.0  1  0
## Datsun 710    22.8   4  108  93 3.85 2.32 18.6  1  1
## Merc 230      22.8   4  141  95 3.92 3.15 22.9  1  0
## Merc 240D     24.4   4  147  62 3.69 3.19 20.0  1  0
## Porsche 914-2 26.0   4  120  91 4.43 2.14 16.7  0  1
##               gear carb
## Volvo 142E       4    2
## Toyota Corona    3    1
## Datsun 710       4    1
## Merc 230         4    2
## Merc 240D        4    2
## Porsche 914-2    5    2

Merging/joining Database joins are used to combine information from multiple database tables. In R, this is done with the merge() function. Let's look at an example.

load("~/Desktop/243/data/exampleMerge.RData")
fits2 <- merge(fits, master, by.x = "stn", by.y = "coop", all.x = TRUE,
    sort = FALSE)
identical(fits$stn, fits2$stn)  # hmmm...., ordering has not been preserved

## [1] FALSE

which(fits2$stn != fits$stn)[1:5]

## [1] 3 4 5 6 7

The output of merge() can have an arbitrary ordering, which may be unappealing. Here's a little function that deals with that.

f.merge <- function(x, y, ...) {
    # merges two datasets but preserves the ordering based on the first dataset
    x$ordering <- 1:nrow(x)
    tmp <- merge(x, y, ...)
    tmp <- tmp[order(tmp$ordering), ]  # sort by the indexing
    return(tmp[, !(names(tmp) == "ordering")])
}

Long and wide formats Finally, we may want to convert between so-called 'long' and 'wide' formats, which are motivated by working with longitudinal data (multiple observations per subject). The wide format has repeated measurements for a subject in separate columns, while the long format has repeated measurements in separate rows, with a column for differentiating the repeated measurements. stack() converts from wide to long while unstack() does the reverse. reshape() is similar but more flexible and it can go in either direction. The wide format is useful for doing separate analyses by group, while the long format is useful for doing a single analysis that makes use of the groups, such as ANOVA or mixed models. Let's use the precipitation data as an example.

66 load("~/Desktop/243/data/prec.RData") prec <- prec[1:1000, ] # just to make the example code run faster precVars <- 5:ncol(prec) precStacked <- stack(prec, select = precVars) out <- unstack(precStacked) # to use reshape, we need a unique id for each row # since reshape considers each row in the wide format # as a subject prec <- cbind(unique = 1:nrow(prec), prec) precVars <- precVars + 1 precLong <- reshape(prec, varying = names(prec)[precVars], idvar = "unique", direction = "long", sep = "") precLong <- precLong[!is.na(precLong$prec), ] precWide <- reshape(precLong, v.names = "prec", idvar = "unique", direction = "wide", sep = "")

Check out melt() and cast() in the reshape package for easier argument formats than reshape().

Working with factors You can create a factor with the cut() function. By default the levels are not ordered, but we can manipulate them, or make sure they’re ordered directly from cut().

x <- rnorm(100)
f <- cut(x, breaks = c(-Inf, -1, 1, Inf), labels = c("low", "medium", "high"))
levels(f)  # note that f is not explicitly ordered

## [1] "low" "medium" "high"

f <- relevel(f, "high") # puts high as first level f <- cut(x, breaks = c(-Inf, -1, 1, Inf), labels = c("low", "medium", "high"), ordered_result = TRUE)

Changing the order of the levels can be helpful for controlling the order of plotting of levels and for controlling the baseline category in an ANOVA setting.

14 Output from R

14.1 Writing output to files

Functions for text output are generally analogous to those for input. write.table(), write.csv(), and writeLines() are analogs of read.table(), read.csv(), and readLines(). write() can be used to write a matrix to a file, specifying the number of columns desired. cat() can be used when you want fine control of the format of what is written out and allows for outputting to a connection (e.g., a file). And of course you can always save to an R data file using save.image() (to save all the objects in the workspace) or save() (to save only some objects). Happily this is platform-independent so can be used to transfer R objects between different OSes.
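Here is a minimal sketch of a few of these, with made-up file names:

write.csv(mtcars, file = "mtcars.csv")  # a text file readable by other programs
x <- rnorm(10)
results <- list(mean = mean(x), sd = sd(x))
save(x, results, file = "analysis.RData")  # save selected objects
# save.image("everything.RData")           # or save the whole workspace
load("analysis.RData")  # restores x and results by name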

14.2 Formatting output

One thing to be aware of when writing out numerical data is how many digits are included. For example, the default with write() and cat() is the number of digits displayed to the screen, controlled by options()$digits (to change this, do options(digits = 5) or specify digits as an argument to write() or cat()). If you want finer control, use sprintf(), e.g., to print out temperatures as reals ("f" = floating point) with four decimal places and nine total character positions, followed by a C for Celsius:

temps <- c(12.5, 37.234324, 1342434324.79997, 2.3456e-06, 1e+10)
sprintf("%9.4f C", temps)

## [1] " 12.5000 C" " 37.2343 C" ## [3] "1342434324.8000 C" " 0.0000 C" ## [5] "10000000000.0000 C"

cat() is a good choice for printing a message to the screen, often better than print(), which is an object-oriented method. You generally won’t have control over how the output of a print() statement is actually printed.

val <- 1.5
cat("My value is ", val, ".\n", sep = "")

## My value is 1.5.

print(paste("My value is ", val, ".", sep = ""))

## [1] "My value is 1.5."

We can do more to control formatting with cat():

# input
x <- 7
n <- 5
# display powers
cat("Powers of", x, "\n")
cat("exponent result\n\n")
result <- 1
for (i in 1:n) {
    result <- result * x
    cat(format(i, width = 8), format(result, width = 10), "\n", sep = "")
}

x <- 7
n <- 5
# display powers
cat("Powers of", x, "\n")
cat("exponent result\n\n")
result <- 1
for (i in 1:n) {
    result <- result * x
    cat(i, "\t", result, "\n", sep = "")
}

Good coding practices

September 20, 2012

References:

• Murrell, Introduction to Data Technologies, Ch. 2

• Journal of Statistical Software vol. 42: 19 Ways of Looking at Statistical Software

Some of these tips apply more to software development and some more to analyses done for specific projects; hopefully it will be clear which is which in most cases.

1 Editors

Use an editor that supports the language you are using (e.g., Emacs, WinEdt, Tinn-R on Windows, or the built-in editors in RStudio or the Mac R GUI). Some advantages of this can include: (1) helpful color coding of different types of syntax and of strings, (2) automatic indentation and spacing, (3) code can often be run or compiled from within the editor, (4) parenthesis matching, (5) line numbering (good for finding bugs).

2 Coding syntax

• Header information: put metainfo on the code into the first few lines of the file as comments. Include who, when, what, and how the code fits within a larger program (if appropriate), possibly the versions of R and key packages that you wrote this for

• Indentation: do this systematically (your editor can help here). This helps you and others to read and understand the code and can help in detecting errors in your code because it can expose lack of symmetry.

• Whitespace: use a lot of it. Some places where it is good to have it are (1) around operators (assignment and arithmetic), (2) between function arguments and list elements, (3) between matrix/array indices, in particular for missing indices.

• Use blank lines to separate blocks of code and comments to say what the block does

• Split long lines at meaningful places.

• Use parentheses for clarity even if not needed for order of operations. For example, a/y*x will work but is not easy to read and you can easily induce a bug if you forget the order of ops.

• Documentation - add lots of comments (but don’t belabor the obvious). Remember that in a few months, you may not follow your own code any better than a stranger. Some key things to document: (1) summarizing a block of code, (2) explaining a very complicated piece of code - recall our complicated regular expressions, (3) explaining arbitrary constant values.

• For software development, break code into separate files (<2000-3000 lines per file) with meaningful file names and related functions grouped within a file.

• Choose a consistent naming style for objects and functions: e.g. nIts vs. n.its vs numberOfIts vs. n_its

– Adler and Google’s R style guide recommend naming objects with lowercase words, separated by periods, while naming functions by capitalizing the name of each word that is joined together, with no periods.

• Try to have the names be informative without being overly long.

• Don’t overwrite names of objects/functions that already exist in R. E.g., don’t use ’lm’. > exists(“lm”)

• Use active names for functions (e.g., calc.loglik, calcLogLik)

• Learn from others’ code

This semester, someone will be reading your code - me, when I look at your assignments. So to help me in understanding your code and develop good habits, put these ideas into practice in your assignments.

3 Coding style

This is particularly focused on software development, but some of the ideas are useful for data analysis as well.

• Break down tasks into core units

• Write reusable code for core functionality and keep a single copy of the code (w/ backups of course) so you only need to change it once

• Smaller functions are easier to debug, easier to understand, and can be combined in a modular fashion (like the UNIX utilities)

• Write functions that take data as an argument and not lines of code that operate on specific data objects. Why? Functions allow us to reuse blocks of code easily for later use and for recreating an analysis (reproducible research). It’s more transparent than sourcing a file of code because the inputs and outputs are specified formally, so you don’t have to read through the code to figure out what it does.

• Functions should:

– be modular (having a single task);
– have a meaningful name; and
– have a comment describing their purpose, inputs and outputs (see the help file for an R function for how this is done in that context).

• Object orientation is a nice way to go

• Don’t hard code numbers - use variables (e.g., number of iterations, parameter values in simulations), even if you don’t expect to change the value, as this makes the code more readable:
> speedOfLight <- 3e8

• Use lists to keep disparate parts of related data together

• Practice defensive programming

– check function inputs and warn users if the code will do something they might not expect or makes particular choices;
– check inputs to if statements and the ranges in for loops;

– provide reasonable default arguments;
– document the range of valid inputs;
– check that the output produced is valid; and
– stop execution based on checks and give an informative error message.

• Try to avoid system-dependent code that only runs on a specific OS or a specific version of an OS

• Learn from others’ code

• Consider rewriting your code once you know all the settings and conditions; often analyses and projects meander as we do our work and the initial plan for the code no longer makes sense and the code is no longer designed specifically for the job being done.

4 Tips for avoiding bugs

• Write an initial version of the code in the simplest way, without trying to be efficient (e.g., use for loops even if you’re coding in R); then make a second version that employs efficiency tricks and check that both produce the same output.

• Make big changes in small steps, sequentially checking to see whether the code has broken on your test case(s).

• Plan out your code in advance, including all special cases/possibilities.

• Figure out if a functionality already exists in (or can be adapted from) an R package (or potentially in a C/Fortran library/package): code that is part of standard mathematical/numerical packages will probably be more efficient and bug-free than anything you would write.

• When doing software development, write tests for your code early in the process.
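As a minimal sketch of what an early test might look like (centerVec is a made-up function, not something from the notes), a test can just be a script of checks that you re-run as the code evolves:

centerVec <- function(x) x - mean(x)    # hypothetical function under test
stopifnot(length(centerVec(1:5)) == 5)  # output has the right length
stopifnot(isTRUE(all.equal(mean(centerVec(rnorm(100))), 0)))  # mean is numerically zero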

5 Dealing with errors

When writing functions, and software more generally, you’ll want to warn the user or stop execution when there is an error and exit gracefully, giving the user some idea of what happened. The warning() and stop() functions allow you to do this; in general they would be called based on an if statement. stopifnot() can stop based on checking multiple conditions.
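Here’s a minimal sketch of how these might be used together (the logit function and its particular checks are made up for illustration):

logit <- function(p) {
    stopifnot(is.numeric(p))                   # stop unless all conditions hold
    if (any(p < 0 | p > 1, na.rm = TRUE))
        stop("elements of 'p' must be between 0 and 1")
    if (any(p == 0 | p == 1, na.rm = TRUE))
        warning("some probabilities are exactly 0 or 1; returning -Inf or Inf")
    return(log(p/(1 - p)))
}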

You can control what happens when a warning occurs via the warn option (e.g., options(warn = 2)). This can be helpful for debugging - e.g., you can force R to stop if a warning is issued rather than continuing so you can delve into what happened. Also, sometimes a function you call will fail, but you want to continue execution. For example, suppose you are fitting a bunch of linear models and occasionally the design matrix is singular. You can wrap a function call within the try() function (or tryCatch()) and then your code won’t stop. You can also evaluate whether a given function call executed properly or not. Here’s an example of fitting a model for extreme values:

library(ismev)
library(methods)
n <- 100
nDays <- 365
x <- matrix(rnorm(nDays * n), nr = nDays)
x <- apply(x, 2, max)
x <- cbind(rep(0, 100), x)
params <- matrix(NA, nr = ncol(x), nc = 3)
for (i in 1:ncol(x)) {
    fit <- try(gev.fit(x[, i], show = FALSE))
    if (!is(fit, "try-error")) params[i, ] = fit$mle
}
params

##       [,1]  [,2]    [,3]
## [1,]    NA    NA      NA
## [2,] 2.772 0.337 -0.1401

Challenge: figure out how to use tryCatch() to deal with the error above. Note that I haven’t used it and it seemed somewhat inscrutable on quick look.

6 Running analyses

Save your output at intermediate steps (including the state of the random seed) so you can restart if an error occurs or a computer fails. Using save() and save.image() to write to .RData files works well for this.
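As a sketch (the object and file names here are made up), one might periodically save the partial results and the state of the random number generator within a long-running loop:

set.seed(0)
results <- rep(NA, 1000)
for (it in 1:1000) {
    results[it] <- mean(rnorm(10000))    # stand-in for an expensive computation
    if (it %% 100 == 0) {
        seedState <- .Random.seed        # current state of the RNG
        save(results, seedState, it, file = "interimResults.RData")
    }
}
# to restart later: load("interimResults.RData"); .Random.seed <- seedState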

Run your code on a small subset of the problem before setting off a job that runs for hours or days. Make sure that the code works on the small subset and saves what you need properly at the end.

7 Versioning and backup

For data analysis and your own project work:

• Basic version control ideas: whenever you make changes to the structure of your code/program, make it a new version, numbering sequentially (or with the more sophisticated version numbering used by software developers).

• Keep a text file that documents what changes from version to version (or do this in the document header). I find this easier than trying to name my code files informatively.

• DO NOT get rid of old code versions (text is cheap to store) and DO NOT get rid of blocks of code that work.

• Keep a running file (or comments in the code) of what you’d like to change in the future.

In addition, for software development you might consider:

• Using version control software: git, cvs and svn are several tools for this. You can host a git repository at github.com for free (provided you don’t mind having others see it).

8 Documenting an analysis

Provenance is becoming increasingly important in science. It basically means being able to trace the steps of an analysis back to its origins. Replicability is a related concept - the idea is that you or someone else could replicate the analysis that you’ve done. This can be surprisingly hard as time passes even if you’re the one attempting the replication. Let’s think through what is required for something to be replicable.

• Have a directory for each project with meaningful subdirectories: e.g., code, data, paper

• Keep a document describing your running analysis with dates in a text file (i.e., a lab book)

• Note where data were obtained (and when, which can be helpful when publishing) and pre- processing steps in the lab book. Have data version numbers with a file describing the changes and dates (or in lab book).

• Have a file of code for pre-processing, one or more for analysis, and one for figure/table preparation.

– Have the code file for the figures produce the EXACT manuscript figures, operating on an RData file that contains all the objects necessary to run the figure-producing code; the code producing the RData file should be in your analysis code file (or somewhere else sensible).
– Alternatively, use Sweave or knitr for your document preparation.

• Note what code files do what in the lab book.

Debugging and Profiling

September 26, 2012

Sources:

• Chambers

• Roger Peng’s notes on debugging in R

1 Common syntax errors and bugs

Tips for avoiding bugs

1. Build up code in pieces, testing along the way.

2. Use core R functionality and algorithms already coded.

3. Remove objects you don’t need, to avoid accidentally using values from an old object via the scoping rules.

4. Be careful that the conditions of if statements and the sequences of for loops are robust when they involve evaluating R code.

5. Write code for clarity and accuracy first; then worry about efficiency.

6. Code in a modular fashion, making good use of functions, so that you don’t need to debug the same code multiple times.

Common syntax errors and bugs:

1. Parenthesis mis-matches

2. [[...]] vs. [...]

3. == vs. =

4. Comparing real numbers exactly using ’==’ is dangerous (more in a later Unit). Suppose you generate x = 0.333333 in some fashion with some code and then check:
> x == 1/3 # FALSE is the result

5. Vectors vs. single values:

(a) || vs. | and && vs. &
(b) You expect a single value but your code gives you a vector
(c) You want to compare an entire vector but your code just compares the first value (e.g., in an if statement) – consider using identical() or all.equal()

6. Silent type conversion when you don’t want it, or lack of coercion where you’re expecting it

7. Using the wrong function or variable name

8. Giving arguments to a function in the wrong order

9. In an if-then-else statement, the else cannot be on its own line (unless all the code is enclosed in {}), because R will see the if-then part of the statement, which is a valid R statement, execute it, and then encounter the else and return an error. We saw this in Unit 3.

10. Forgetting to define a variable in the environment of a function and having the function, via lexical scoping, get that variable as a global variable from one of the enclosing environments. At best the types are not compatible and you get an error; at worst, you use a garbage value and the bug is hard to trace. In some cases your code may work fine when you develop the code (if the variable exists in the enclosing environment), but then may not work when you restart R if the variable no longer exists or is different.

11. R (usually helpfully) drops matrix and array dimensions that are extraneous, which can sometimes confuse later code that expects an object of a certain dimension. The ’[’ operator takes an additional optional argument that can avoid dropping dimensions.
mat <- matrix(1:4, 2, 2)[1, ]
dim(mat)

## NULL

print(mat)

## [1] 1 3

colSums(mat)

## Error: ’x’ must be an array of at least two dimensions

mat <- matrix(1:4, 2, 2)[1, , drop = FALSE]

2 Debugging Strategies

Debugging is about figuring out what went wrong and where it went wrong. In compiled languages, one of the difficulties is figuring out what is going on at any given place in the program. This is a lot easier in R by virtue of the fact that R is interpreted and we can step through code line by line at the command line. However, beyond this, there are a variety of helpful tools for debugging R code. In particular these tools can help you step through functions and work inside of functions from packages.

2.1 Basic strategies

Read and think about the error message. Sometimes it’s inscrutable, but often it just needs a bit of deciphering. Looking up a given error message in the R mailing list archive can be a good strategy.
Fix errors from the top down - fix the first error that is reported, because later errors are often caused by the initial error. It’s common to have a string of many errors, which looks daunting, caused by a single initial error.
Is the bug reproducible - does it always happen in the same way at the same point? It can help to restart R and see if the bug persists - this can sometimes help in figuring out if there is a scoping issue and we are using a global variable that we did not mean to.
Another basic strategy is to build up code in pieces (or tear it back in pieces to a simpler version). This allows you to isolate where the error is occurring.
To stop code if a condition is not satisfied, you can use stopifnot(), e.g.,
> x <- 3
> stopifnot(is(x, "matrix"))
This allows you to catch errors that can be anticipated.
The codetools library has some useful tools for checking code, including a function, findGlobals(), that lets you look for the use of global variables:

options(width = 55)
library(codetools)
findGlobals(lm)[1:25]

##  [1] "<-"             "=="             "-"
##  [4] "!"              "!="             "["
##  [7] "[[<-"           "{"              "$<-"
## [10] "*"              "&&"             "as.name"
## [13] "as.vector"      "attr"           "c"
## [16] "class<-"        "eval"           "gettextf"
## [19] ".getXlevels"    "if"             "is.empty.model"
## [22] "is.matrix"      "is.null"        "is.numeric"
## [25] "length"

f <- function() {
    y <- 3
    print(x + y)
}
findGlobals(f)

## [1] "<-" "{" "+" "print" "x"

If you’ve written your code modularly with lots of functions, you can test individual functions. Often the error will be in what gets passed into and out of each function.
You can have warnings printed as they occur, rather than saved, using options(warn = 1). This can help figure out where in a loop a warning is being generated. You can also have R convert warnings to errors using options(warn = 2).
At the beginning of time (the 1970s?), the standard debugging strategy was to insert print statements in one’s code to see the value of a variable and thereby decipher what could be going wrong. We have better tools nowadays.
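For example:

options(warn = 1)  # print warnings as they occur (useful inside loops)
options(warn = 2)  # convert warnings to errors, so traceback()/recover() can be used
options(warn = 0)  # back to the default: store warnings and report them at the end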

2.2 Interactive debugging via the browser

The core strategy for interactive debugging is to use browser(), which pauses the current execution, and provides an interpreter, allowing you to view the current state of R. You can invoke browser() in four ways

• by inserting a call to browser() in your code if you suspect where things are going wrong

• by invoking the browser after every step of a function using debug()

• by using options(error = recover) to invoke the browser when an error occurs

• by temporarily modifying a function to allow browsing using trace()

Once in the browser, you can execute any R commands you want. In particular, using ls() to look at the objects residing in the current function environment, looking at the values of objects, and examining the classes of objects is often helpful.
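As a minimal sketch of the first approach (myFun is a made-up function):

myFun <- function(x) {
    y <- x^2
    browser()     # execution pauses here; try ls(), print y, then 'n' to step or 'Q' to quit
    z <- sqrt(y) + 1
    return(z)
}
myFun(3)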

2.3 Using debug() to step through code

To step through a function, use debug(nameOfFunction). Then run your code. When the function is executed, R will pause execution just before the first line of the function. You are now using the browser and can examine the state of R and execute R statements. In addition, you can use “n” or return to step to the next line, “c” to execute the entire current function or current loop, and “Q” to stop debugging. We’ll see an example in the demo code. To unflag the function so that calling it doesn’t invoke the debugger, use undebug(nameOfFunction). In addition to working with functions you write, you can use debug() with standard R functions and functions from packages. For example, you could do debug(glm).
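A sketch of the workflow (meanOfSquares is a made-up function):

meanOfSquares <- function(x) {
    y <- x^2
    return(mean(y))
}
debug(meanOfSquares)
meanOfSquares(1:10)     # the browser opens before the first line; use 'n', 'c', 'Q'
undebug(meanOfSquares)  # stop invoking the browser for this function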

2.4 Tracing errors in the call stack

traceback() and recover() allow you to see the call stack (the sequence of nested function calls) at the time of an error. This helps pinpoint where in a series of function calls the error may be occurring. If you’ve run the code and gotten an error, you can invoke traceback() after things have gone awry. R will show you the stack, which can help pinpoint where an error is occurring. More helpful is to be able to browse within the call stack. To do this invoke options(error = recover) (potentially in your .Rprofile if you do a lot of programming). Then when an error occurs, recover() gets called, usually from the function in which the error occurred. The call to recover() allows you to navigate the stack of active function calls at the time of the error and browse within the desired call. You just enter the number of the call you’d like to enter (or 0 to exit). You can then look around in the frame of a given function, entering an empty line when you want to return to the list of calls again. You can also combine this with options(warn = 2), which turns warnings into errors to get to the point where a warning was issued.

2.5 Using trace() to temporarily insert code

trace() lets you temporarily insert code into a function (including standard R functions and functions in packages!) that can then be easily removed. You can use trace() in a few ways - here’s how you would do it most simply, where by default the second argument is invoked at the start of the function given as the first argument, but it is also possible to invoke it just before exiting the function:
trace(lm, recover) # invoke recover() when the function starts
trace(lm, exit = browser) # invoke browser() when the function ends
trace(lm, browser, exit = browser) # invoke browser() at start and end
Then in this example, once the browser activates I can poke around within the lm() function and see what is going on. The most flexible way to use trace() is to use the argument edit = TRUE and then insert whatever code you want wherever you want. If I want to ensure I use a particular editor, such as emacs, I can use the argument edit = "emacs". A standard approach would be to add a line with browser() to step through the code or recover() (to see the call stack and just look at the current state of objects). Alternatively, you can manually change the code in a function without using trace(), but it’s very easy to forget to change things back and hard to do this with functions in packages, so trace() is a nice way to do things. You call untrace(), e.g., untrace(lm), to remove the temporarily inserted code; otherwise it’s removed when the session ends. Alternatively you can do trace(warning, recover), which will insert a call to recover() whenever warning() is called.

3 Memory management

3.1 Allocating and freeing memory

Unlike compiled languages like C, in R we do not need to explicitly allocate storage for objects. However, we have seen that there are times that we do want to allocate storage in advance, rather than successively concatenating onto a larger object. R automatically manages memory, releasing memory back to the operating system when it’s not needed via garbage collection. Occasionally you will want to remove large objects as soon as they are not needed. rm() does not actually free up memory; it just disassociates the name from the memory used to store the object. In general R will clean up such objects without a reference (i.e., a name), but you may need to call gc() to force the garbage collection. This uses some computation, so it’s generally not recommended.

In a language like C in which the user allocates and frees up memory, memory leaks are a major cause of bugs. Basically if you are looping and you allocate memory at each iteration and forget to free it, the memory use builds up inexorably and eventually the machine runs out of memory. In R, with automatic garbage collection, this is generally not an issue, but occasionally memory leaks do occur.

3.2 Monitoring memory use

There are a number of ways to see how much memory is being used. When R is actively executing statements, you can use top from the UNIX command line. In R, gc() reports memory use and free memory as Ncells and Vcells. As far as I know, Ncells concerns the overhead of running R and Vcells relates to objects created by the user, so you’ll want to focus on Vcells. You can see the number of Mb currently used (the “used” column of the output) and the maximum used in the session (the “max used” column).

gc()

##          used (Mb) gc trigger (Mb) max used (Mb)
## Ncells 152797  8.2     350000 18.7   350000 18.7
## Vcells 237920  1.9     786432  6.0   669149  5.2

x <- rnorm(1e+08) # should use about 800 Mb
gc()

##             used  (Mb) gc trigger  (Mb)  max used
## Ncells    152824   8.2     350000  18.7    350000
## Vcells 100238272 764.8  110849727 845.8 100240351
##         (Mb)
## Ncells  18.7
## Vcells 764.8

rm(x)
gc()

##          used (Mb) gc trigger  (Mb)  max used  (Mb)
## Ncells 152842  8.2     350000  18.7    350000  18.7
## Vcells 238313  1.9   88679781 676.6 100243914 764.9

In Windows only, memory.size() tells how much memory is being used.

You can check the amount of memory used by individual objects with object.size(). One frustration with memory management is that if your code bumps up against the memory limits of the machine, it can be very slow to respond even when you’re trying to cancel the statement with Ctrl-C. You can impose memory limits in Linux by starting R (from the UNIX prompt) in a fashion such as this:
> R --max-vsize=1000M
Then if you try to create an object that will push you over that limit or execute code that involves going over the limit, it will simply fail with the message “Error: vector memory exhausted (limit reached?)”. So this approach may be a nice way to avoid paging by setting the maximum in relation to the physical memory of the machine. It might also help in debugging memory leaks because the program would fail at the point that memory use was increasing. I haven’t played around with this much, so offer this with a note of caution.

4 Benchmarking

As we’ve seen, system.time() is very handy for comparing the speed of different implementations.
n <- 1000
x <- matrix(rnorm(n^2), n)
system.time({
    mns <- rep(NA, n)
    for (i in 1:n) mns[i] <- mean(x[i, ])
})

## user system elapsed
## 0.024 0.004 0.027

system.time(rowMeans(x))

## user system elapsed ## 0.004 0.000 0.003

The rbenchmark package provides a nice wrapper function, benchmark(), that automates speed assessments.
library(rbenchmark)
# speed of one calculation

n <- 1000
x <- matrix(rnorm(n^2), n)
benchmark(crossprod(x), replications = 10,
    columns = c("test", "elapsed", "replications"))

## test elapsed replications ## 1 crossprod(x) 0.605 10

# comparing different approaches to a task
benchmark({
    mns <- rep(NA, n)
    for (i in 1:n) mns[i] <- mean(x[i, ])
}, rowMeans(x), replications = 10,
    columns = c("test", "elapsed", "replications"))

##                                                                  test
## 1 {\n    mns <- rep(NA, n)\n    for (i in 1:n) mns[i] <- mean(x[i, ])\n}
## 2                                                         rowMeans(x)
##   elapsed replications
## 1   0.234           10
## 2   0.029           10

5 Profiling

The Rprof() function will show you how much time is spent in different functions, which can help you pinpoint bottlenecks in your code.

library(fields)
Rprof("makeTS.prof")
out <- makeTS(0.1, 1000)
Rprof(NULL)
summaryRprof("makeTS.prof")

Here’s the result for the makeTS() function from the demo code file:

$by.self
           self.time self.pct total.time total.pct
".Call"         0.38    48.72       0.38     48.72
".Fortran"      0.22    28.21       0.22     28.21
"matrix"        0.08    10.26       0.30     38.46
"exp"           0.08    10.26       0.08     10.26
"/"             0.02     2.56       0.02      2.56

$by.total
                  total.time total.pct self.time self.pct
"makeTS"                0.78    100.00      0.00     0.00
".Call"                 0.38     48.72      0.38    48.72
"chol.default"          0.38     48.72      0.00     0.00
"chol"                  0.38     48.72      0.00     0.00
"standardGeneric"       0.38     48.72      0.00     0.00
"matrix"                0.30     38.46      0.08    10.26
"rdist"                 0.30     38.46      0.00     0.00
".Fortran"              0.22     28.21      0.22    28.21
"exp"                   0.08     10.26      0.08    10.26
"/"                     0.02      2.56      0.02     2.56

$sample.interval
[1] 0.02

$sampling.time
[1] 0.78

Rprof() tells how much time was spent in each function alone (the by.self bit) and aggregating the time spent in a function and all of the functions that it calls (the by.total bit). Usually the former is going to be more useful, but in some cases we need to decipher what is going on based on the latter.
Let’s figure out what is going on here. The self time tells us that .Call (a call to C code), .Fortran (a call to Fortran code) and matrix() take up most of the time. Looking at the total time, and seeing that chol.default() uses .Call and that rdist() uses .Fortran() and matrix(), we can infer that about 49% of the time is being spent in the Cholesky and 38% in the rdist() calculation, with 10% in exp(). As we increase the number of time points, the time taken up by the Cholesky would increase, since that calculation is order $n^3$ while the others are order $n^2$ (more in the linear algebra unit).
Apparently there is a memory profiler in R, Rprofmem(), but it needs to be enabled when R is compiled (i.e., installed on the machine), because it slows R down even when not used. So I’ve never gotten to the point of playing around with it.

6 Getting help online

Just searching the R mailing list archive often gives you a hint of how to fix things. An example occurred when I was trying to figure out how to fix a problem that was reporting a “multibyte string” error in some emails in a dataset of spam emails. I knew it had something to do with the encoding and R not interpreting the codes for non-ASCII characters correctly, but I wasn’t sure how to fix it. So I searched for “invalid multibyte string”. Around the 8th hit or so there was a comment about using iconv() to convert to the UTF-8 encoding, which solved the problem.
If you’ve searched the archive and haven’t found an answer to your problem, you can often get help by posting to the R-help mailing list. A few guidelines (generally relevant when posting to mailing lists beyond just the R lists):

1. Search the archives and look through relevant R books or manuals first.

2. Boil your problem down to its essence, giving an example that includes the output and error message.

3. Say what version of R, what operating system and what operating system version you’re using. Both sessionInfo() and Sys.info() can be helpful for getting this information.

4. Read the posting guide.

The mailing list is a way to get free advice from the experts, who include some of the world’s most knowledgeable R experts - seriously - members of the R core development team contribute frequently. The cost is that you should do your homework and that sometimes the responses you get may be blunt, along the lines of “read the manual”. I think it’s a pretty good tradeoff - where else do you get the foremost experts in a domain actually helping you? Note: of course the mailing list archive is also helpful for figuring out how to do things, not just for fixing bugs.

R Programming

October 8, 2012

References:

• Adler

• Chambers

• Murrell, Introduction to Data Technologies.

• R intro manual (R-intro) and R language manual (R-lang), both on CRAN.

• Venables and Ripley, Modern Applied Statistics with S

1 Efficiency

In general, make use of R’s built-in functions, as these tend to be implemented internally (i.e., via compiled code in C or Fortran). In particular, if R is linked to optimized BLAS and Lapack code (e.g., Intel’s MKL, OpenBLAS [on the SCF Linux servers], or AMD’s ACML), you should have good performance (potentially comparable to Matlab and to coding in C). As far as I can tell, Macs have good built-in linear algebra, but I’m not sure what BLAS is being used behind the scenes. Often you can figure out a trick to take your problem and transform it to make use of the built-in functions.
Note that I run a lot of MCMCs, so I pay attention to making sure my calculations are fast, as they are done repeatedly. Similarly, one would want to pay attention to speed when doing large simulations and bootstrapping, and in some cases for optimization. And if you’re distributing code, it’s good to have it be efficient. But in other contexts, it may not be worth your time. Also, it’s good practice to code it transparently first to reduce bugs and then to use tricks to speed it up, making sure the fast version works correctly.
Results can vary with your system setup and version of R, so the best thing to do is figure out where the bottlenecks are in your code, and then play around with alternative specifications.

Finally, as you gain more experience, you’ll get some intuition for what approaches might improve speed, but even with experience I find myself often surprised by what matters and what doesn’t. It’s often worth trying out a bunch of different ideas; system.time() and benchmark() are your workhorse tools in this context.

1.1 Fast initialization

It is very inefficient to iteratively add elements to a vector (using c()) or iteratively use cbind() or rbind() to add rows or columns to matrices or dataframes. Instead, create the full object in advance (this is equivalent to variable initialization in compiled languages) and then fill in the appropriate elements. The reason is that when R appends to an existing object, it creates a new copy and as the object gets big, this gets slow when one does it a lot of times. Here’s an illustrative example, but of course we would not fill a vector like this (see the section on vectorization).

options(width = 50)
n <- 10000
x <- 1
system.time(for (i in 2:n) x <- c(x, i))

## user system elapsed ## 0.112 0.004 0.116

system.time({
    x <- rep(as.numeric(NA), n)
    for (i in 1:n) x[i] <- i
})

## user system elapsed ## 0.012 0.004 0.015

It’s not necessary to use as.numeric() above though it saves a bit of time. Challenge: figure out why I have as.numeric(NA) and not just NA. We can actually speed up the initialization (though in most practical circumstances, the second approach here would be overkill):

n <- 1e+06
system.time(x <- rep(as.numeric(NA), n))

## user system elapsed
## 0.068 0.000 0.071

system.time({
    x <- as.numeric(NA)
    length(x) <- n
})

## user system elapsed ## 0.004 0.000 0.003

For matrices, start with the right length vector and then change the dimensions:
nr <- nc <- 2000
system.time(x <- matrix(as.numeric(NA), nr, nc))

## user system elapsed
## 0.036 0.008 0.048

system.time({
    x <- as.numeric(NA)
    length(x) <- nr * nc
    dim(x) <- c(nr, nc)
})

## user system elapsed ## 0.028 0.012 0.041

For lists, we can do this:
myList <- vector("list", length = n)

1.2 Vectorized calculations

One key way to write efficient R code is to take advantage of R’s vectorized operations.

n <- 1e+06
x <- rnorm(n)
system.time(x2 <- x^2)

## user system elapsed ## 0.000 0.004 0.002

x2 <- as.numeric(NA)
system.time({
    length(x2) <- n
    for (i in 1:n) {
        x2[i] <- x[i]^2
    }
}) # how many orders of magnitude slower?

## user system elapsed ## 2.116 0.000 2.127

So what’s different in how R handles the calculations that explains the disparity? The vectorized calculation is being done natively in C in a for loop. The second version involves executing the for loop in R, with repeated calls to C code at each iteration. You can usually get a sense for how quickly an R call will pass things along to C by looking at the body of the relevant function(s) being called and looking for .Primitive, .Internal, .C, .Call, or .Fortran. Let’s take a look at the code for ‘+‘, mean.default(), and chol.default().
Many R functions allow you to pass in vectors and operate on those vectors in vectorized fashion. So before writing a for loop, look at the help information on the relevant function(s) to see if they operate in a vectorized fashion.

line <- c("Four score and 7 years ago, this nation")
startIndices = seq(1, by = 3, length = nchar(line)/3)
substring(line, startIndices, startIndices + 1)

## [1] "Fo" "r " "co" "e " "nd" "7 " "ea" "s " "go" ## [10] " t" "is" "na" "io"

Challenge: Consider the chi-squared statistic involved in a test of independence in a contingency table:
$$\chi^2 = \sum_i \sum_j \frac{(y_{ij} - e_{ij})^2}{e_{ij}}, \qquad e_{ij} = \frac{y_{i\cdot}\, y_{\cdot j}}{y_{\cdot\cdot}},$$

where $y_{i\cdot} = \sum_j y_{ij}$ (the row total), and similarly for the column and overall totals. Write this in a vectorized way without any loops. Note that ’vectorized’ calculations also work with matrices and arrays.
Vectorized operations can also be faster than built-in functions, and clever vectorized calculations even better, though sometimes the code is uglier:

x <- rnorm(1e+06)
system.time(truncx <- ifelse(x > 0, x, 0))

## user system elapsed ## 0.344 0.008 0.353

system.time({
    truncx <- x
    truncx[x < 0] <- 0
})

## user system elapsed ## 0.020 0.000 0.022

system.time(truncx <- x * (x > 0))

## user system elapsed ## 0.008 0.000 0.009

The demo code has a surprising example where combining vectorized calculations with a for loop is actually faster than using apply(). The goal is to remove rows of a large matrix that have any NAs in them. Additional tips:

• If you do need to loop over dimensions of a matrix or array, if possible loop over the smallest dimension and use the vectorized calculation on the larger dimension(s).

• Looping over columns is likely to be faster than looping over rows given column-major ordering.

• You can use direct arithmetic operations to add/subtract/multiply/divide a vector by each column of a matrix, e.g., A*b multiplies each column of A elementwise by the vector b. If you need to operate by row, you can do it by transposing the matrix.

Caution: relying on R’s recycling rule in the context of vectorized operations, such as is done when direct-multiplying a matrix by a vector to scale the rows, can be dangerous, as the code is not transparent and poses greater dangers of bugs. If it’s needed to speed up a calculation, the best approach is to (1) first write the code transparently and then compare the efficient code to make sure the results are the same and (2) comment your code.
Challenge: What do the points above imply about how to choose to store values in a matrix? How would you choose what should be the row dimension and what should be the column dimension?

1.3 Using apply() and specialized functions

Another core efficiency strategy is to use the apply() functionality. Even better than apply() for calculating sums or means of columns or rows (it also can be used for arrays) is {row,col}{Sums,Means}:

n <- 3000
x <- matrix(rnorm(n * n), nr = n)
system.time(out <- apply(x, 1, mean))

## user system elapsed ## 0.220 0.028 0.250

system.time(out <- rowMeans(x))

## user system elapsed ## 0.024 0.000 0.025

We can ’sweep’ out a summary statistic, such as subtracting off a mean from each column, using sweep()

system.time(out <- sweep(x, 2, STATS = colMeans(x), FUN = "-"))

## user system elapsed ## 0.300 0.080 0.385

Here’s a trick for doing it even faster based on vectorized calculations, remembering that if we subtract a vector from a matrix, it subtracts each element of the vector from all the elements in the corresponding ROW.

system.time(out2 <- t(t(x) - colMeans(x)))

## user system elapsed ## 0.252 0.056 0.306

identical(out, out2)

## [1] TRUE

As we’ve discussed, using versions of apply() with lists may or may not be faster than looping but generally produces cleaner code. If you’re worried about speed, it’s a good idea to benchmark the apply() variant against looping.
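For instance, one might compare sapply() against an explicit loop as in this sketch (myList and loopFun are made-up names; timings will vary by machine):

library(rbenchmark)
myList <- lapply(1:1000, function(i) rnorm(100))
loopFun <- function(L) {
    out <- numeric(length(L))
    for (i in seq_along(L)) out[i] <- mean(L[[i]])
    out
}
benchmark(sapply(myList, mean), loopFun(myList),
    replications = 10, columns = c("test", "elapsed", "replications"))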

1.4 Matrix algebra efficiency

Often calculations that are not explicitly linear algebra calculations can be done as matrix algebra. The following can be done faster with rowSums(), so it’s not a great example, but this sort of trick does come in handy in surprising places.

mat <- matrix(rnorm(500 * 500), 500)
system.time(apply(mat, 1, sum))

## user system elapsed ## 0.008 0.000 0.005

system.time(mat %*% rep(1, ncol(mat)))

## user system elapsed ## 0.004 0.000 0.001

system.time(rowSums(mat))

## user system elapsed ## 0.004 0.000 0.002

On the other hand, big matrix operations can be slow. Suppose you want a new matrix that computes the differences between successive columns of a matrix of arbitrary size. How would you do this as matrix algebra operations? [see demo code] Here it turns out that the for loop is much faster than matrix multiplication. However, there is a way to do it faster as matrix direct subtraction.
Comment: the demo code also contains some exploration of different ways of creating patterned matrices. Note that this level of optimization is only worth it if you’re doing something over and over again, or writing code that you will distribute.
When doing matrix algebra, the order in which you do operations can be critical for efficiency. How should I order the following calculation?

n <- 5000
A <- matrix(rnorm(5000 * 5000), 5000)
B <- matrix(rnorm(5000 * 5000), 5000)
x <- rnorm(5000)
system.time(res <- A %*% B %*% x)

## user system elapsed
## 48.44 24.15 10.67
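For comparison, here is a sketch (res1 and res2 are just illustrative names; timings are machine-dependent) showing that grouping the matrix-vector product first avoids ever forming the large matrix-matrix product:

system.time(res1 <- (A %*% B) %*% x)  # forms the 5000 x 5000 product first
system.time(res2 <- A %*% (B %*% x))  # only two matrix-vector multiplications
all.equal(res1, res2)                 # same answer, up to numerical error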

We can use the matrix direct product (i.e., A*B) to do some manipulations much more quickly than using matrix multiplication. Challenge: How can I use the direct product to find the trace of XY?
You can generally get much faster results by being smart when using diagonal matrices. Here are some examples, where we avoid directly computing X + D, DX, and XD:

n <- 1000
X <- matrix(rnorm(n^2), n)
diagvals <- rnorm(n)
D = diag(diagvals)
diag(X) <- diag(X) + diagvals
tmp <- diagvals * X # instead of D %*% X
tmp2 <- t(t(X) * diagvals) # instead of X %*% D

More generally, sparse matrices and structured matrices (such as block diagonal matrices) can be worked with MUCH more efficiently than arbitrary matrices. The spam (for arbitrary sparse matrices) and bdsmatrix (for block-diagonal matrices) packages in R can help, as can specialized code available in other languages, such as C and Fortran packages.
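As a rough sketch of the kind of gain, using the spam package mentioned above (the banded matrix here is made up purely for illustration, and the timings are not from the notes):

library(spam)
n <- 2000
m <- matrix(0, n, n)
diag(m) <- 1
m[cbind(1:(n - 1), 2:n)] <- 0.5              # a banded (hence sparse) matrix
mSparse <- as.spam(m)                        # stores only the non-zero entries
v <- rnorm(n)
system.time(for (i in 1:100) m %*% v)        # dense multiplies
system.time(for (i in 1:100) mSparse %*% v)  # sparse multiplies touch only non-zeros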

1.5 Fast mapping/lookup tables

Sometimes you need to map between two vectors. E.g., $y_{ij} \sim \mathcal{N}(\mu_j, \sigma^2)$ is a basic ANOVA-type structure. Here are some efficient ways to aggregate to the cluster level and disaggregate to the observation level.

Disaggregate: Create a vector, idVec, that gives a numeric mapping of the observations to their cluster. Then you can access the µ value relevant for each observation as mus[idVec].
Aggregate: To find the sample means by cluster: sapply(split(dat$obs, idVec), mean)
R provides something akin to a lookup table when you have named objects. For example:

vals <- rnorm(10)
names(vals) <- letters[1:10]
labs <- c("h", "h", "a", "c")
vals[labs]

##      h      h      a       c
## 1.7634 1.7634 0.4193 -0.2495

You can do similar things with dimension names of matrices/arrays, row and column names of dataframes, and named lists. I haven’t looked into this much, but according to Adler, if you need to do something like this where you are looking up from amongst very many items, the fast way to do it is by looking up items within an environment (see Section 5 of this unit) rather than within a named vector or list, because environments are implemented using hash tables.
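Here’s a small sketch comparing lookup by name in a named vector against lookup in an environment (the names and sizes are arbitrary; I haven’t benchmarked this carefully):

library(rbenchmark)
n <- 1e+05
vals <- rnorm(n)
names(vals) <- paste("id", 1:n, sep = "")
lookupEnv <- list2env(as.list(vals), envir = new.env(hash = TRUE))  # hashed lookup
benchmark(vals["id99999"], get("id99999", envir = lookupEnv),
    replications = 1000, columns = c("test", "elapsed", "replications"))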

1.6 Byte compiling

R now allows you to compile R code, which goes by the name of byte compiling. Byte-compiled code is a special representation that can be executed more efficiently because it is in the form of compact codes that encode the results of parsing and semantic analysis of scoping and other complexities of the R source code. This byte code can be executed faster than the original R code because it skips the stage of having to be interpreted by the R interpreter. The functions in the base and stats packages are now byte-compiled by default. (If you print out a function that is byte-compiled, you’ll see something like <bytecode: ...> at the bottom.) We can byte compile our own functions using cmpfun(). Here’s an example (silly since we would actually do this calculation using vectorized operations):

library(compiler)
library(rbenchmark)
f <- function(x) {
    for (i in 1:length(x)) x[i] <- x[i] + 1
    return(x)
}
fc <- cmpfun(f)
fc # notice the indication that the function is byte compiled.

## function(x) {
##     for (i in 1:length(x)) x[i] <- x[i] + 1
##     return(x)
## }
##

benchmark(f(x), fc(x), replications = 5)

##    test replications elapsed relative user.self
## 2 fc(x)            5   0.007    1.000     0.008
## 1  f(x)            5   0.046    6.571     0.048
##   sys.self user.child sys.child
## 2        0          0         0
## 1        0          0         0

You can compile an entire source file with cmpfile(), which produces a .Rc file. You then need to use loadcmp() to load in the .Rc file, which runs the code.

1.7 Challenges

One or more of these challenges may appear on a problem set.
Challenge 1: here’s a calculation of the sort needed in mixture component modeling. I have a vector of n observations. I need to find the likelihood of each observation under each of p mixture components (i.e., what’s the likelihood if it came from each of the components). So I should produce a matrix of n rows and p columns where the value in the ith row, jth column is the likelihood of the ith observation under the jth mixture component. The idea is that the likelihoods for a given observation are used in assigning observations to clusters. A naive implementation is:
> lik <- matrix(NA, nr = n, nc = p)
> for(j in 1:p) lik[ , j] <- dnorm(y, mns[j], sds[j])
Note that dnorm() can handle matrices and vectors as the observations and as the means and sds, so there are multiple ways to do this.

Challenge 2: Suppose you have $y_i \sim \mathcal{N}\left(\sum_{k=1}^{m_i} w_{i,k}\,\mu_{ID[i,k]},\ \sigma^2\right)$ for a large number of observations, n. I give you a vector of µ values, a ragged list of weights, and a ragged list of IDs identifying the cluster corresponding to each weight (note $m_i$ varies by observation); this is a mixed membership type model. Figure out how to calculate the vector of means, $\sum_k w_{i,k}\,\mu_{ID[i,k]}$, as fast as possible. Suppose that $m_i$ never gets too big (but µ might have many elements) - could this help you? Part of thinking this through involves thinking about how you want to store the information so that the calculations can be done quickly.
Challenge 3: Write code that simulates a random walk in two dimensions for n steps. First write out a straightforward implementation that involves looping. Then try to speed it up. The cumsum() function may be helpful.

2 Advanced topics in working with functions

2.1 Pointers

By way of contrast to R’s pass-by-value system, I want to briefly discuss the idea of a pointer, common in compiled languages such as C.
int x = 3;
int* ptr;
ptr = &x;
*ptr * 7; // evaluates to 21
Here ptr is the address of the integer x. Vectors in C are really pointers to a block of memory:
int x[10];
In this case x will be the address of the first element of the vector. We can access the first element as x[0] or *x.
Why have we gone into this? In C, you can pass a pointer as an argument to a function. The result is that only the scalar address is copied and not the entire vector, and inside the function, one can modify the original vector, with the new value persisting on exit from the function. For example:
void myCal(int *ptr){
    *ptr = *ptr + *ptr;
}
When calling C or C++ from R, one (implicitly) passes pointers to the vectors into C. Let’s see an example:
out <- rep(0, n)

out <- .C("logLik", out = as.double(out), theta = as.double(theta))$out
In C, the function definition looks like this:
void logLik(double* out, double* theta)

2.2 Alternatives to pass by value in R

There are occasions when we do not want to pass by value. The main reason is when we want a function to modify a complicated object without having to return it and re-assign it in the parent environment. There are several work-arounds:

1. We can use Reference Class objects. Reference classes are new in R. We’ll discuss these in Section 4.

2. We can access the object in the enclosing environment as a ’global variable’, as we’ve seen when discussing scoping. More generally we can access the object using get(), specifying the environment from which we want to obtain the variable. Recall that to specify the location of an object, we can generally specify (1) a position in the search path, (2) an explicit environment, or (3) a location in the call stack by using sys.frame(). However, we cannot change the value of the object in the parent environment without some additional tools.

(a) We can use the ’<<-’ operator to assign into an object in the parent environment (provided an object of that name exists in the parent environment).
(b) We can also use assign(), specifying the environment in which we want the assignment to occur.

3. We can use replacement functions (Section 2.4), which hide the reassignment in the parent environment from the user. Note that a second copy is generally created in this case, but the original copy is quickly removed.

4. We can use a closure. This involves creating functions within a function call and returning the functions as a list. When one executes the enclosing function, the list is created and one can call the functions of that object. Those functions then can access objects in the enclosing environment (the environment of the original function) and can use ‘<<-‘ to assign into the enclosing environment, to which all the functions have access. Chambers provides an example of this in Sec. 5.4.

• A simplified version of using closures appeared in Unit 3:

x <- rnorm(10)
f <- function(input) {
    data <- input
    g <- function(param) return(param * data)
}
myFun <- f(x)
rm(x) # to demonstrate we no longer need x
myFun(3)

## [1] -3.3518 -2.7875  2.0417 -0.3284 -4.2282
## [6] -2.3780 -1.3514 -2.4657 -5.1306 -3.9202

x <- rnorm(1e+07)
myFun <- f(x)
object.size(myFun)

## 1560 bytes

• A related approach is to wrap data with a function using with(). This approach appeared in Problem Set 2.

x <- rnorm(10)
myFun2 <- with(list(data = x), function(param) return(param * data))
rm(x)
myFun2(3)

## [1]  1.5227  4.4157  2.9531  0.4543 -2.3201
## [6]  1.7621 -1.6900 -0.3851  0.5225 -1.6843

x <- rnorm(1e+07)
myFun2 <- with(list(data = x), function(param) return(param * data))
object.size(myFun2)

## 1560 bytes

2.3 Operators

Operators, such as ’+’ and ’[’, are just functions, but their arguments can occur both before and after the function call:

a <- 7; b <- 3
# let's think about the following as a mathematical function
# -- what's the function call?
a + b

## [1] 10

`+`(a, b)

## [1] 10

In general, you can use back-ticks to refer to the operators as operators instead of characters. In some cases single or double quotes also work. We can look at the code of an operator as follows using back-ticks to escape out of the standard R parsing, e.g., ‘%*%‘. Finally, since an operator is just a function, you can use it as an argument in various places:

myList = list(list(a = 1:5, b = "sdf"), list(a = 6:10, b = "wer"))
myMat = sapply(myList, `[[`, 1) # note that the index '1' is the additional
                                # argument to the [[ function
x <- 1:3
y <- c(100, 200, 300)
outer(x, y, `+`)

##      [,1] [,2] [,3]
## [1,]  101  201  301
## [2,]  102  202  302
## [3,]  103  203  303

You can define your own binary operator (an operator taking two arguments) using a string inside % symbols:

`%2%` <- function(a, b) {
    2 * (a + b)
}
3 %2% 7

## [1] 20

Since operators are just functions, there are cases in which there are optional arguments that we might not expect. We’ve already briefly seen the drop argument to the ‘[‘ operator:

mat <- matrix(1:4, 2, 2)
mat[, 1]

## [1] 1 2

mat[, 1, drop = FALSE] # what's the difference?

##      [,1]
## [1,]    1
## [2,]    2

2.4 Unexpected functions and replacement functions

All code in R can be viewed as a function call. What do you think is the functional version of the following code? What are the arguments?

if (x > 27) {
    print(x)
} else {
    print("too small")
}

Assignments that involve functions or operators on the left-hand side (LHS) are called replacement expressions or replacement functions. These can be quite handy. Here are a few examples:

diag(mat) <- c(3, 2)
is.na(vec) <- 3
names(df) <- c("var1", "var2")

Replacement expressions are actually function calls. The R interpreter calls the replacement function (which often creates a new object that includes the replacement) and then assigns the result to the name of the original object.
mat <- matrix(rnorm(4), 2, 2)
diag(mat) <- c(3, 2)
mat <- `diag<-`(mat, c(10, 21))
base::`diag<-`

## function (x, value)
## {
##     dx <- dim(x)
##     if (length(dx) != 2L)
##         stop("only matrix diagonals can be replaced")
##     len.i <- min(dx)
##     i <- seq_len(len.i)
##     len.v <- length(value)
##     if (len.v != 1L && len.v != len.i)
##         stop("replacement diagonal has wrong length")
##     if (len.i > 0L)
##         x[cbind(i, i)] <- value
##     x
## }
##
##

The old version of mat still exists until R’s memory management cleans it up, but it’s no longer referred to by the symbol ’mat’. Occasionally this sort of thing might cause memory usage to increase (for example it’s possible if you’re doing replacements on large objects within a loop), but in general things should be fine.
You can define your own replacement functions like this, with the requirements that the last argument be named ’value’ and that the function return the entire object:
me <- list(name = "Chris", age = 25)
`name<-` <- function(obj, value) {
    obj$name <- value
    return(obj)
}
name(me) <- "Christopher"

2.5 Functions as objects

Note that a function is just an object.

x <- 3
x(2)

## Error: could not find function "x"

x <- function(z) z^2
x(2)

## [1] 4

We can call a function based on the text name of the function.

myFun = "mean"
x = rnorm(10)
eval(as.name(myFun))(x)

## [1] -0.03887

We can also pass a function into another function either as the actual function object or as a character vector of length one with the name of the function. Here match.fun() is a handy function that extracts a function when the function is passed in as an argument. It looks in the calling environment and can handle the case where the function is passed in as a function object or as a character vector of length one giving the function name.

f <- function(fxn, x) {
    match.fun(fxn)(x)
}
f("mean", x)

## [1] -0.03887

This allows us to write functions in which the user passes in the function (as an example, this works when using outer()). Caution: one may need to think carefully about scoping issues in such contexts.
Function objects contain three components: an argument list, a body (a parsed R statement), and an environment.

f <- function(x) x
f2 <- function(x) y <- x^2
f3 <- function(x) {
    y <- x^2
    z <- x^3
    return(list(y, z))
}
class(f)

## [1] "function"

typeof(body(f))

## [1] "symbol"

class(body(f))

## [1] "name"

typeof(body(f2))

## [1] "language"

class(body(f2))

## [1] "<-"

typeof(body(f3))

## [1] "language"

class(body(f3))

## [1] "{"

We’ll see more about objects relating to the R language and parsed code in a later section. For now, just realize that the parsed code itself is treated as objects with certain types and certain classes. We can extract the argument object as follows:

f4 <- function(x, y = 2, z = 1/y) {
    x + y + z
}
args <- formals(f4)
class(args)

## [1] "pairlist"

A pairlist is like a list but is stored as a set of pairs; in this case the pairing matches each argument name with its default value.
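For example, continuing with f4 from above (these two lines are just illustrative), we can pull out individual defaults; note that the default for z is stored as an unevaluated expression:

formals(f4)$y   # the default value 2
formals(f4)$z   # the unevaluated expression 1/y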

2.6 Promises and lazy evaluation

One additional concept that it’s useful to be aware of is the idea of a promise object. In function calls, when R matches user input arguments to formal argument names, it does not (usually) evaluate the arguments until they are needed, which is called lazy evaluation. Instead the formal arguments are of a special type called a promise. Let’s see lazy evaluation in action. Do you think the following code will run?

f <- function(a, b = c) {
    c <- log(a)
    return(a * b)
}
f(7)

What’s strange about this? Another example:

f <- function(x) print("hi")
system.time(mean(rnorm(1e+06)))

## user system elapsed ## 0.068 0.000 0.068

system.time(f(3))

## [1] "hi"

## user system elapsed ## 0 0 0

system.time(f(mean(rnorm(1e+06))))

## [1] "hi"

## user system elapsed ## 0 0 0

3 Evaluating memory use

We’ve seen the use of gc() to assess total memory use and object.size() to see how much memory objects are using. The main things to remember when thinking about memory use are (1) that numeric vectors take 8 bytes per element and (2) that you should keep track of when large objects are created, including in the frames of functions. Note that rm() does not immediately free up memory; it merely disassociates the symbol/name pointing to the memory location from that memory, though this should happen fairly soon after an object is removed. You can call gc() to force the garbage collection to occur.
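For instance, a quick check of the 8-bytes-per-element rule of thumb (a small sketch, not from the demo code):

x <- rnorm(1e+06)
object.size(x)                       # about 8 bytes per element, roughly 8 MB total
print(object.size(x), units = "Mb")  # same information, reported in Mb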

3.1 Hidden uses of memory

• Replacement functions can hide the use of additional memory.

x <- rnorm(1e+07)
gc()
dim(x) <- c(10000, 1000)
diag(x) <- 1
gc()

How much memory was used?

• Not all replacement functions actually involve creating a new object and replacing the original object.

x <- rnorm(1e+07)
.Internal(inspect(x))
x[5] <- 7
.Internal(inspect(x))
gc()

• Indexing large subsets can involve a lot of memory use.

x <- rnorm(1e+07)
gc()
y <- x[1:(length(x) - 1)]
gc()

Why was more memory used than just for x and y? This is a limitation of R that relates to it being a scripting language, though in principle R could be designed to avoid the problem.

3.2 Passing objects to compiled code

As we’ve already discussed, when R objects are passed to compiled code (e.g., C or C++), they are passed as pointers and the compiled code uses the memory allocated by R (though it could also allocate additional memory if allocation is part of the code). So calling a C function from R will generally not have much memory overhead. However, we need to be aware of any casting that occurs, because the compiled code requires that the R object types match those that the function in the compiled code is expecting. Here’s an example of calling compiled code:
res <- .C("fastcount", PACKAGE = "GCcorrect",
    tablex = as.integer(tablex), tabley = as.integer(tabley),
    as.integer(xvar), as.integer(yvar),
    as.integer(useline), as.integer(length(xvar)))
Let’s consider when copies are made in casts:

f <- function(x) {
    print(.Internal(inspect(x)))
    return(mean(x))
}
x <- rnorm(1e+07)
class(x)
debug(f)
f(x)
f(as.numeric(x))
f(as.integer(x))

3.3 Lazy evaluation, delayed copying (copy-on-change) and memory use

Now let’s consider how lazy evaluation affects memory use. We’ll also see that something like lazy evaluation occurs outside of functions as well. Let’s see what goes on within a function in terms of memory use in different situations.
f <- function(x) {
    print(gc())
    z <- x[1]
    .Internal(inspect(x))
}
y <- rnorm(1e+07)
gc()
.Internal(inspect(y))
f(y)

In fact, this occurs outside function calls as well. Copies of objects are not made until one of the objects is actually modified. Initially, the copy points to the same memory location as the original object.
y <- rnorm(1e+07)
gc()
.Internal(inspect(y))
x <- y
gc()
.Internal(inspect(x))
x[1] <- 5
gc()
.Internal(inspect(x))
rm(x)
x <- y
.Internal(inspect(x))
.Internal(inspect(y))
y[1] <- 5
.Internal(inspect(x))
.Internal(inspect(y))

Challenge: How much memory is used in the following calculation?

x <- rnorm(1e+07)
myfun <- function(y) {
    z <- y
    return(mean(z))
}
myfun(x)

## [1] -0.0001403

How about here? What is going on?

x <- c(NA, x)
myfun <- function(y) {
    return(mean(y, na.rm = TRUE))
}

This makes sense if we look at mean.default(). Consider where additional memory is used.

3.4 Strategies for saving memory

A couple of basic strategies for saving memory include:

• Avoiding unnecessary copies

• Removing objects that are not being used and, if necessary, doing a gc() call

If you’re really trying to optimize memory use, you may also consider:

• Using reference classes and similar strategies to pass by reference

• Substituting integer and logical vectors for numeric vectors when possible

3.5 Example

Let’s work through a real example where we keep a running tally of current memory in use and maximum memory used in a function call. We’ll want to consider hidden uses of memory, passing objects to compiled code, and lazy evaluation. This code is courtesy of Yuval Benjamini. For our purposes here, let’s assume that xvar and yvar are very long vectors using a lot of memory.

fastcount <- function(xvar, yvar) {
    naline <- is.na(xvar)
    naline[is.na(yvar)] = TRUE
    xvar[naline] <- 0
    yvar[naline] <- 0
    useline <- !naline
    # Table must be initialized for -1's
    tablex <- numeric(max(xvar) + 1)
    tabley <- numeric(max(xvar) + 1)
    stopifnot(length(xvar) == length(yvar))
    res <- .C("fastcount", PACKAGE = "GCcorrect",
        tablex = as.integer(tablex), tabley = as.integer(tabley),
        as.integer(xvar), as.integer(yvar),
        as.integer(useline), as.integer(length(xvar)))
    xuse <- which(res$tablex > 0)
    xnames <- xuse - 1
    resb <- rbind(res$tablex[xuse], res$tabley[xuse])
    colnames(resb) <- xnames
    return(resb)
}

4 Object-oriented programming (OOP)

Popular languages that use OOP include C++, Java, and Python. In fact C++ is the object-oriented version of C. Different languages implement OOP in different ways. The idea of OOP is that all operations are built around objects, which have a class, and methods that operate on objects in the class. Classes are constructed to build on (inherit from) each other, so that one class may be a specialized form of another class, extending the components and methods of the simpler class (e.g., lm and glm objects).

Note that in more formal OOP languages, all functions are associated with a class, while in R, only some are. Often when you get to the point of developing OOP code in R, you’re doing more serious programming and acting as a software engineer. It’s a good idea to think carefully in advance about the design of the classes and methods.

4.1 S3 approach

S3 classes are widely-used, in particular for statistical models in the stats package. S3 classes are very informal in that there’s not a formal definition for an S3 class. Instead, an S3 object is just a primitive R object such as a list or vector with additional attributes including a class name.

Inheritance Let’s look at the lm class, which builds on lists, and the glm class, which builds on the lm class. Here mod1 is an object (an instance) of class lm. An analogy is the difference between a random variable and a realization of that random variable.

library(methods)
yb <- sample(c(0, 1), 10, replace = TRUE)
yc <- rnorm(10)
x <- rnorm(10)
mod1 <- lm(yc ~ x)
mod2 <- glm(yb ~ x, family = binomial)
class(mod1)

## [1] "lm"

class(mod2)

## [1] "glm" "lm"

is.list(mod1)

## [1] TRUE

names(mod1)

##  [1] "coefficients"  "residuals"
##  [3] "effects"       "rank"
##  [5] "fitted.values" "assign"
##  [7] "qr"            "df.residual"
##  [9] "xlevels"       "call"
## [11] "terms"         "model"

is(mod2, "lm")

## [1] TRUE

methods(class = "lm")

##  [1] add1.lm*           alias.lm*
##  [3] anova.lm           case.names.lm*
##  [5] confint.lm*        cooks.distance.lm*
##  [7] deviance.lm*       dfbeta.lm*
##  [9] dfbetas.lm*        drop1.lm*
## [11] dummy.coef.lm*     effects.lm*
## [13] extractAIC.lm*     family.lm*
## [15] formula.lm*        hatvalues.lm
## [17] influence.lm*      kappa.lm
## [19] labels.lm*         logLik.lm*
## [21] model.frame.lm     model.matrix.lm
## [23] nobs.lm*           plot.lm
## [25] predict.lm         print.lm
## [27] proj.lm*           qr.lm*
## [29] residuals.lm       rstandard.lm
## [31] rstudent.lm        simulate.lm*
## [33] summary.lm         variable.names.lm*
## [35] vcov.lm*
##
## Non-visible functions are asterisked

Often S3 classes inherit from lists (i.e., are special cases of lists), so you can obtain components of the object using the $ operator.
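
For example, since mod1 is a list underneath, we can pull out components directly (a small illustration of my own, using the objects created above):

mod1$coefficients
mod1[["coefficients"]]  # equivalent list-style extraction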

Creating our own class We can create an object with a new class as follows:

me <- list(firstname = "Chris", surname = "Paciorek", age = NA)
class(me) <- "indiv"  # there is already a 'person' class in R

Actually, if we want to create a new class that we’ll use again, we want to create a constructor function that initializes new individuals:

indiv <- function(firstname = NA, surname = NA, age = NA) {
    # constructor for 'indiv' class
    obj <- list(firstname = firstname, surname = surname, age = age)
    class(obj) <- "indiv"
    return(obj)
}
me <- indiv("Chris", "Paciorek")

For those of you used to more formal OOP, the following is probably disconcerting:

class(me) <- "silly"
class(me) <- "indiv"

Methods The real power of OOP comes from defining methods. For example,

mod <- lm(yc ~ x)
summary(mod)
gmod <- glm(yb ~ x, family = "binomial")
summary(gmod)

Here summary() is a generic method (or generic function) that, based on the type of object given to it (the first argument), dispatches a class-specific function (method) that operates on the object. This is convenient for working with objects using familiar functions. Consider the generic methods plot(), print(), summary(), `[`, and others. We can look at a function and easily see whether it is a generic method, and we can see which classes have methods for a given generic method.

mean

## function (x, ...)
## UseMethod("mean")
##
##

methods(mean)

## [1] mean.data.frame mean.Date
## [3] mean.default    mean.difftime
## [5] mean.POSIXct    mean.POSIXlt

In many cases there will be a default method (here, mean.default()), so if no method is defined for the class, R uses the default. Sidenote: arguments to a generic method are passed along to the selected method by passing along the calling environment. We can define new generic methods:

summarize <- function(object, ...) UseMethod("summarize")

Once UseMethod() is called, R searches for the specific method associated with the class of object and calls that method, without ever returning to the generic method. Let’s try this out on our indiv class. In reality, we’d write either summary.indiv() or print.indiv() (and of course the generics for summary and print already exist) but for illustration, I wanted to show how we would write both the generic and the specific method.

summarize.indiv <- function(object)
    return(with(object, cat("Individual of age ", age, " whose name is ",
        firstname, " ", surname, ".\n", sep = "")))
summarize(me)

## Individual of age NA whose name is Chris Paciorek.

Note that the print() function is what is called when you simply type the name of the object, so we can have object information printed out in a structured way. Recall that the output when we type the name of an lm object is NOT simply a regurgitation of the elements of the list - rather print.lm() is called. Similarly, when we used print(object.size(x)) we were invoking the object_size-specific print method, which gets the value of the size and then formats it. So there’s actually a fair amount going on behind the scenes.
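
As a minimal sketch of this (my own illustration, not part of the original example), we could give our indiv class its own print() method, so that simply typing the object’s name produces formatted output:

print.indiv <- function(x, ...) {
    # the object argument is named 'x' to match the print(x, ...) generic
    cat("indiv: ", x$firstname, " ", x$surname, " (age ", x$age, ")\n", sep = "")
    invisible(x)  # by convention, print methods return the object invisibly
}
me  # autoprinting now dispatches to print.indiv()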

Surprisingly, the summary() method generally doesn’t actually print out information; rather it computes things not stored in the original object and returns them as an object of a new class (e.g., class summary.lm), which is then automatically printed, per my comment above, using print.summary.lm(), unless one assigns the result to a new object. Note that print.summary.lm() is hidden from user view.

out <- summary(mod)
out
print(out)
getS3method(f = "print", class = "summary.lm")

More on inheritance As noted with lm and glm objects, we can assign more than one class to an object. Here summarize() still works, even though the primary class is BerkeleyIndiv.

class(me) <- c("BerkeleyIndiv", "indiv")
summarize(me)

## Individual of age NA whose name is Chris Paciorek.

The classes should nest within one another with the more specific classes to the left; e.g., here a BerkeleyIndiv would have some additional components on top of those of an indiv, perhaps a CalNet ID, and perhaps additional or modified methods. BerkeleyIndiv inherits from indiv, and R uses methods for the first class before methods for the next class(es), unless no such method is defined for the first class. If no methods are defined for any of the classes, R looks for method.default(), e.g., print.default(), plot.default(), etc.
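
Here’s a hedged illustration of that dispatch order, using a hypothetical BerkeleyIndiv-specific method (not part of the notes’ running example):

summarize.BerkeleyIndiv <- function(object)
    cat("Berkeley individual", object$firstname, object$surname, "\n")
summarize(me)  # uses summarize.BerkeleyIndiv(), the first class with a method
rm(summarize.BerkeleyIndiv)
summarize(me)  # now falls back to summarize.indiv(), inherited from 'indiv'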

Class-specific operators We can also use operators with our classes. The following example will be a bit silly (it would make more sense with a class that is a mathematical object) but indicates the power of having methods.

methods(`+`)

## [1] +.Date +.POSIXt

`+.indiv` <- function(object, incr) {
    object$age <- object$age + incr
    return(object)
}
old.me <- me + 15

Class-specific replacement functions We can use replacement functions with our classes. This is again a bit silly, but we could do the following. We need to define the generic replacement function and then the class-specific one.

`age<-` <- function(x, ...) UseMethod("age<-")
`age<-.indiv` <- function(object, value) {
    object$age <- value
    return(object)
}
age(old.me) <- 60

Why use class-specific methods? We could have implemented different functionality (e.g., for summary()) for different objects using a bunch of if statements (or switch()) to figure out what class of object is the input, but then we need to have all that checking. Furthermore, we don’t control the summary() function, so we would have no way of adding the additional conditions in a big if-else statement. The OOP framework makes things extensible, so we can build our own new functionality on what is already in R.

Final thoughts Recall the Date class we discussed in Unit 4. This is an example of an S3 class, with methods such as julian(), weekdays(), etc.

Challenge: how would you get R to quit immediately, without asking for any more information, when you simply type ’q’ (no parentheses!)?

What we’ve just discussed is the old-style R (and S) object orientation, called S3 methods. The new style is called S4 and we’ll discuss it next. S3 is still commonly used, in part because S4 can be slow (or at least it was when I last looked into this a few years ago). S4 is more structured than S3.

4.2 S4 approach

S4 methods are used a lot in Bioconductor. They’re also used in lme4, among other packages. Tools for working with S4 classes are in the methods package. Note that components of S4 objects are obtained as object@component, so they do not use the usual list syntax. The components are called slots, and there is careful checking that the slots are specified and valid when a new object of a class is created. You can use the prototype argument to setClass() to set default values for the slots. The default constructor is the initialize() function, but you can modify it. One can create methods for operators and for replacement functions too.

For S4 classes, there is a default method invoked when print() is called on an object in the class (either explicitly or implicitly) - the method is actually called show() and it can also be modified. Let’s reconsider our indiv class example in the S4 context.

library(methods)
setClass("indiv", representation(name = "character", age = "numeric",
    birthday = "Date"))
me <- new("indiv", name = "Chris", age = 20, birthday = as.Date("91-08-03"))
# next notice the missing age slot
me <- new("indiv", name = "Chris", birthday = as.Date("91-08-03"))
# finally, apparently there's not a default object of class Date
me <- new("indiv", name = "Chris", age = 20)

## Error: invalid class "indiv" object: invalid object for slot "birthday" in class "indiv": got class "S4", should be or extend class "Date"

me

## An object of class "indiv"
## Slot "name":
## [1] "Chris"
##
## Slot "age":
## numeric(0)
##
## Slot "birthday":
## [1] "91-08-03"

me@age <- 60

S4 methods are designed to be more structured than S3, with careful checking of the slots.

setValidity("indiv", function(object) {
    if (!(object@age > 0 && object@age < 130))
        return("error: age must be between 0 and 130")
    if (length(grep("[0-9]", object@name)))
        return("error: name contains digits")
    return(TRUE)
    # what other validity check would make sense given the slots?
})

## Class "indiv" [in ".GlobalEnv"]
##
## Slots:
##
## Name:       name       age  birthday
## Class: character   numeric      Date

me <- new("indiv", name = "5z%a", age = 20, birthday = as.Date("91-08-03"))

## Error: invalid class "indiv" object: error: name contains digits

me <- new("indiv", name = "Z%a B''*", age = 20, birthday = as.Date("91-08-03"))
me@age <- 150  # so our validity check is not foolproof

To deal with this latter issue of the user mucking with the slots, it’s recommended when using OOP that slots only be accessible through methods that operate on the object, e.g., a setAge() method that checks the validity of the supplied age before modifying the slot (a sketch of this appears after the voter example below). Here’s how we create generic and class-specific methods. Note that in some cases the generic will already exist.

# generic method
setGeneric("voter", function(object, ...) {
    standardGeneric("voter")
})

## [1] "voter"

# class-specific method
voter.indiv = function(object) {
    if (object@age > 17) {
        cat(object@name, "is of voting age.\n")
    } else cat(object@name, "is not of voting age.\n")
}
setMethod(voter, signature = c("indiv"), definition = voter.indiv)

## [1] "voter"
## attr(,"package")
## [1] ".GlobalEnv"

voter(me)

## Z%a B''* is of voting age.
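
Returning to the earlier recommendation that slots be modified only through methods, here is a minimal sketch of the setAge() idea (setAge is an illustrative name I've chosen, not a built-in generic):

setGeneric("setAge", function(object, value) standardGeneric("setAge"))
setMethod("setAge", "indiv", function(object, value) {
    object@age <- value
    validObject(object)  # re-run the validity check registered via setValidity()
    return(object)       # S4 objects are copied, so the caller must reassign the result
})
me <- setAge(me, 21)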

We can have method signatures involve multiple objects. Here’s some syntax where we’d fill in the function body with appropriate code - perhaps the plus operator would create a child.

setMethod(`+`, signature = c("indiv", "indiv"),
    definition = function(indiv1, indiv2) {
    })

As with S3, classes can inherit from one or more other classes. Chambers calls the class that is being inherited from a superclass.

setClass("BerkeleyIndiv", representation(CalID = "character"),
    contains = "indiv")
me <- new("BerkeleyIndiv", name = "Chris", age = 20,
    birthday = as.Date("91-08-03"), CalID = "01349542")
voter(me)

## Chris is of voting age.

is(me, "indiv")

## [1] TRUE

For a more relevant example, suppose we had spatially-indexed time series. We could have a time series class, a spatial location class, and a “location time series” class that inherits from both. Be careful that there are not conflicts in the slots or methods from the multiple classes. For conflicting methods, you can define a method specific to the new class to deal with this. Also, if you define your own initialize() method, you’ll need to be careful that you account for any initialization of the superclass(es) and for any classes that might inherit from your class (see help on new() and Chambers, p. 360). You can inherit from other S4 classes (which need to be defined or imported into the environment in which your class is created), but not S3 classes. You can inherit from (at most one of) the basic R types, but not environments, symbols, or other non-standard types. You can use S3 classes in slots, but this requires that the S3 class be declared as an S4 class.

To do this, you create S4 versions of S3 classes using setOldClass() - this creates a virtual class. This has been done, for example, for the data.frame class:

showClass("data.frame")

## Class "data.frame" [package "methods"]
##
## Slots:
##
## Name:     .Data     names
## Class:     list character
##
## Name:            row.names  .S3Class
## Class: data.frameRowLabels character
##
## Extends:
## Class "list", from data part
## Class "oldClass", directly
## Class "vector", by class "list", distance 2

You can use setClassUnion() to create what Adler calls a superclass and what Chambers calls a virtual class, which allows for methods that apply to multiple classes. So if you have a person class and a pet class, you could create a “named lifeform” virtual class that has methods for working with name and age slots, since both people and pets would have those slots. You can’t directly create an object in the virtual class.
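
Here’s a hedged sketch of setClassUnion(), using hypothetical person and pet classes (these class definitions are illustrative, not part of the running example):

setClass("person", representation(name = "character", age = "numeric"))
setClass("pet", representation(name = "character", age = "numeric"))
setClassUnion("namedLifeform", members = c("person", "pet"))
setGeneric("greet", function(object) standardGeneric("greet"))
# one method defined for the virtual class works for both member classes
setMethod("greet", "namedLifeform", function(object) cat("Hello,", object@name, "\n"))
greet(new("person", name = "Chris", age = 20))
greet(new("pet", name = "Rex", age = 3))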

4.3 Reference Classes

Reference classes are a new construct in R. They are classes somewhat similar to S4 that allow us to access their fields by reference. Importantly, they behave like pointers (the fields in the objects are ’mutable’). Let’s work through an example where we set up the fields of the class (like S4 slots) and class methods, including a constructor. Note that one cannot add fields to an already existing class. Here’s the initial definition of the class.

tsSimClass <- setRefClass("tsSimClass",
    fields = list(n = "numeric", times = "numeric", corMat = "matrix",
        lagMat = "matrix", corParam = "numeric", U = "matrix",
        currentU = "logical"),
    methods = list(
        initialize = function(times = 1:10, corParam = 1, ...) {
            # we seem to need default values for the copy() method to function properly
            require(fields)
            times <<- times  # field assignment requires using <<-
            n <<- length(times)
            corParam <<- corParam
            currentU <<- FALSE
            calcMats()
            callSuper(...)  # calls initializer of base class (envRefClass)
        },
        calcMats = function() {
            # Python-style doc string
            " calculates correlation matrix and Cholesky factor "
            lagMat <- rdist(times)  # local variable
            corMat <<- exp(-lagMat/corParam)  # field assignment
            U <<- chol(corMat)  # field assignment
            cat("Done updating correlation matrix and Cholesky factor\n")
            currentU <<- TRUE
        },
        changeTimes = function(newTimes) {
            times <<- newTimes
            calcMats()
        },
        show = function() {
            # 'print' method
            cat("Object of class 'tsSimClass' with ", n, " time points.\n", sep = "")
        }))

## Warning: Local assignment to field name will not change the field:
##     lagMat <- rdist(times)
## Did you mean to use "<<-"? (in method "calcMats" for class "tsSimClass")

We can add methods after defining the class.

tsSimClass$methods(list(simulate = function() {
    " simulates random processes from the model "
    if (!currentU) calcMats()
    return(crossprod(U, rnorm(n)))
}))

Now let’s see how we would use the class.

master <- tsSimClass$new(1:100, 10)
master
tsSimClass$help("calcMats")
devs <- master$simulate()
plot(master$times, devs, type = "l")
mycopy <- master
myDeepCopy <- master$copy()
master$changeTimes(seq(0, 1, length = 100))
mycopy$times[1:5]
myDeepCopy$times[1:5]

A few additional points:

• As we just saw, a copy of an object is just a pointer to the original object, unless we explicitly invoke the copy() method.

• As with S3 and S4, classes can inherit from other classes. E.g., if we had a simClass and we wanted the tsSimClass to inherit from it: setRefClass("tsSimClass", contains = "simClass")

– We can call a method inherited from the superclass from within a method of the same name with callSuper(...), as we saw for our initialize() method.

• If we need to refer to a field or change a field we can do so without hard-coding the field name as:

master$field("times")[1:5]
# the next line is dangerous in this case, since currentU will no longer be accurate
master$field("times", 1:10)

• Note that reference classes have Python-style doc strings. We get help on a class with class$help(), e.g., tsSimClass$help(). This prints out information, including the doc strings.

• If you need to refer to the entire object within an object method, you refer to it as .self. E.g., with our tsSimClass object, .self$U would refer to the Cholesky factor. This is sometimes necessary to distinguish a class field from an argument to a method.

5 Creating and working in an environment

We’ve already talked extensively about the environments that R creates. Occasionally you may want to create an environment in which to store objects.

e <- new.env()
assign("x", 3, envir = e)  # same as e$x = 3
e$x

## [1] 3

get("x", envir = e, inherits = FALSE)

## [1] 3

# the FALSE avoids looking for x in the enclosing environments
e$y <- 5
objects(e)

## [1] "x" "y"

rm("x", envir = e)
parent.env(e)

## <environment: R_GlobalEnv>

Before the existence of Reference Classes, using an environment was one way to pass objects by reference, avoiding having to re-assign the output. Here’s an example where we iteratively update a random walk.

myWalk <- new.env()
myWalk$pos = 0
nextStep <- function(walk) walk$pos = walk$pos + sample(c(-1, 1), size = 1)
nextStep(myWalk)

We can use eval() to evaluate some code within a specified environment. By default, it evaluates in the result of parent.frame(), which amounts to evaluating in the frame from which eval() was called. evalq() avoids having to use quote().

eval(quote(pos <- pos + sample(c(-1, 1), 1)), envir = myWalk)
evalq(pos <- pos + sample(c(-1, 1), 1), envir = myWalk)

6 Computing on the language

6.1 The R interpreter

Parsing When you run R, the R interpreter takes the code you type or the lines of code that are read in a batch session and parses each statement, translating the text into functional form. It substitutes objects for the symbols (names) that represent those objects and evaluates the statement, returning the resulting object. For complicated R code, this may be recursive. Since everything in R is an object, the result of parsing is an object that we’ll be able to investigate, and the result of evaluating the parsed statement is an object. We’ll see more on parsing in the next section.

.Primitive and .Internal Some functionality is implemented internally within the C implementation that lies at the heart of R. If you see .Internal or .Primitive in the code of a function, you know it’s implemented internally (and therefore generally very quickly). Unfortunately, it also means that you don’t get to see the R code that implements the functionality, though Chambers p. 465 describes how you can look into the C source code. Basically you need to download the source code for the relevant package off of CRAN.

plot.xy

## function (xy, type, pch = par("pch"), lty = par("lty"), col = par("col"),
##     bg = NA, cex = 1, lwd = par("lwd"), ...)
## .Internal(plot.xy(xy, type, pch, lty, col, bg, cex, lwd, ...))
##
##

print(`%*%`)

## function (x, y) .Primitive("%*%")

6.2 Parsing code and understanding language objects

R code is just text and we can actually write R code that will create or manipulate R code. We can then evaluate that R code using eval(). quote() will parse R code, but not evaluate it. This allows you to work with the code rather than the results of evaluating that code. The print() method for language objects is not very helpful! But we can see the parsed code by treating the result as a list.

obj <- quote(if (x > 1) "orange" else "apple")
as.list(obj)

## [[1]]
## `if`
##
## [[2]]
## x > 1
##
## [[3]]
## [1] "orange"
##
## [[4]]
## [1] "apple"

class(obj)

## [1] "if"

weirdObj <- quote(`if`(x > 1, 'orange', 'apple'))
identical(obj, weirdObj)

## [1] TRUE

Recall that to access symbols that involve special syntax (such as special characters), you use backquotes. Officially, the name that you assign to an object (including functions) is a symbol.

x <- 3
typeof(quote(x))

## [1] "symbol"

We can create an expression object that contains R code as follows:

myExpr <- expression(x <- 3)
eval(myExpr)
typeof(myExpr)

## [1] "expression"

The difference between quote() and expression() is basically that quote() works with a single statement, while expression() can deal with multiple statements, returning a list-like object of parsed statements. Both of them parse R code. Table 1 shows the language objects in R; note that there are three classes of language objects: expressions, calls, and names.

Table 1: Language objects in R.

                  Example syntax to create    Class         Type
functions         function() {}               function      closure
object names      quote(x)                    name          symbol (language)
expressions       expression(x <- 3)          expression    expression (language)
function calls    quote(f())                  call          language
if statements     quote(if(x < 3) y=5)        if (call)     language
for statements    quote(for(i in 1:5) {})     for (call)    language
assignments       quote(x <- 3)               <- (call)     language
operators         quote(3 + 7)                call          language

Basically any standard function, operator, if statement, for statement, assignment, etc. is a function call and inherits from the call class. Objects of type language are not officially lists, but they can be queried as such. You can convert between language objects and lists with as.list() and as.call(). An official expression is one or more syntactically correct R statements. When we use quote(), we’re working with a single statement, while expression() will create a list of separate statements (essentially separate call objects).

I’m trying to use the term statement to refer colloquially to R code, rather than using the term expression, since that has a formal definition in this context. Let’s take a look at some examples of language objects and parsing.

e0 <- quote(3)
e1 <- expression(x <- 3)
e1m <- expression({x <- 3; y <- 5})
e2 <- quote(x <- 3)
e3 <- quote(rnorm(3))
print(c(class(e0), typeof(e0)))

## [1] "numeric" "double"

print(c(class(e1), typeof(e1)))

## [1] "expression" "expression"

print(c(class(e1[[1]]), typeof(e1[[1]])))

## [1] "<-" "language"

print(c(class(e1m), typeof(e1m)))

## [1] "expression" "expression"

print(c(class(e2), typeof(e2)))

## [1] "<-" "language"

identical(e1[[1]], e2)

## [1] TRUE

print(c(class(e3), typeof(e3)))

## [1] "call" "language"

e4 <- quote(-7)
print(c(class(e4), typeof(e4)))  # huh? what does this imply?

## [1] "call" "language"

as.list(e4)

## [[1]]
## `-`
##
## [[2]]
## [1] 7

We can evaluate language types using eval():

rm(x)
eval(e1)
rm(x)
eval(e2)
e1mlist <- as.list(e1m)
e2list <- as.list(e2)
eval(as.call(e2list))  # here's how to do it if the language object is actually a list
eval(as.expression(e1mlist))

Now let’s look in more detail at the components of R expressions. We’ll be able to get a sense from this of how R evaluates code. We see that when R evaluates a parse tree, the first element says what function to use and the remaining elements are the arguments. But in many cases one or more arguments will themselves be call objects, so there’s recursion.

e1 = expression(x <- 3)
# e1 is a one-element list with the element an object of class '<-'
print(c(class(e1), typeof(e1)))

## [1] "expression" "expression"

e1[[1]]

## x <- 3

as.list(e1[[1]])

## [[1]]
## `<-`
##
## [[2]]
## x
##
## [[3]]
## [1] 3

lapply(e1[[1]], class)

## [[1]]
## [1] "name"
##
## [[2]]
## [1] "name"
##
## [[3]]
## [1] "numeric"

y = rnorm(5)
e3 = quote(mean(y))
print(c(class(e3), typeof(e3)))

## [1] "call" "language"

e3[[1]]

## mean

print(c(class(e3[[1]]), typeof(e3[[1]])))

## [1] "name"   "symbol"

e3[[2]]

## y

print(c(class(e3[[2]]), typeof(e3[[2]])))

## [1] "name"   "symbol"

# we have recursion
e3 = quote(mean(c(12, 13, 15)))
as.list(e3)

## [[1]]
## mean
##
## [[2]]
## c(12, 13, 15)

as.list(e3[[2]])

## [[1]]
## c
##
## [[2]]
## [1] 12
##
## [[3]]
## [1] 13
##
## [[4]]
## [1] 15

6.3 Manipulating the parse tree

Of course since the parsed code is just an object, we can manipulate it, i.e., compute on the lan- guage:

out <- quote(y <- 3)
out[[3]] <- 4
eval(out)
y

## [1] 4

Here’s another example:

e1 <- quote(4 + 5)
e2 <- quote(plot(x, y))
e2[[1]] <- `+`
eval(e2)

## [1] 7

e1[[3]] <- e2
e1

## 4 + .Primitive("+")(x, y)

class(e1[[3]]) # note the nesting

## [1] "call"

eval(e1) # what should I get?

## [1] 11

We can also turn a parse tree back into standard R code, as text, using deparse(). Going the other direction, parse(text = ...) works like quote(), but it takes the code in the form of a string rather than as actual code:

codeText <- deparse(out)
parsedCode <- parse(text = codeText)  # works like quote except on the code as text
eval(parsedCode)
deparse(quote(if (x > 1) "orange" else "apple"))

## [1] "if (x > 1) \"orange\" else \"apple\""

Note that the quotes have been escaped since they’re inside a string. It can be very useful to be able to convert names of objects as text to names that R interprets as symbols referring to objects:

x3 <- 7
i <- 3
as.name(paste("x", i, sep = ""))

## x3

eval(as.name(paste("x", i, sep = "")))

## [1] 7

6.4 Parsing replacement expressions

Let’s consider replacement expressions.

animals = c("cat", "dog", "rat", "mouse")
out1 = quote(animals[4] <- "rat")
out2 = quote(animals[4] <- "rat")
out3 = quote(`[<-`(animals, 4, "rat"))
as.list(out1)

## [[1]]
## `<-`
##
## [[2]]
## animals[4]
##
## [[3]]
## [1] "rat"

as.list(out2)

## [[1]]
## `<-`
##
## [[2]]
## animals[4]
##
## [[3]]
## [1] "rat"

identical(out1, out2)

## [1] TRUE

as.list(out3)

## [[1]]
## `[<-`
##
## [[2]]
## animals
##
## [[3]]
## [1] 4
##
## [[4]]
## [1] "rat"

identical(out1, out3)

## [1] FALSE

typeof(out1[[2]]) # language

## [1] "language"

class(out1[[2]]) # call

## [1] "call"

The parse tree for out3 is different than those for out1 and out2, but when out3 is evaluated the result is the same as for out1 and out2:

eval(out1)
animals

## [1] "cat" "dog" "rat" "rat"

animals[4] = "dog"
eval(out3)

## [1] "cat" "dog" "rat" "rat"

animals  # both do the same thing

## [1] "cat" "dog" "rat" "rat"

Why? When R evaluates a call to `<-`, if the first argument is a name, then it does the assignment, but if the first argument (i.e., what’s on the left-hand side of the “assignment”) is a call, then it calls the appropriate replacement function. The second argument (the value being assigned) is evaluated first. Ultimately, in all of these cases, the replacement function is used.
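
Here’s a small illustration (my own, using new vectors pets and pets2 so as not to disturb animals above) of that equivalence: the replacement syntax is effectively a call to the replacement function plus a reassignment.

pets <- c("cat", "dog", "rat", "mouse")
pets[4] <- "rat"                    # replacement syntax
pets2 <- c("cat", "dog", "rat", "mouse")
pets2 <- `[<-`(pets2, 4, "rat")     # explicit call to the replacement function
identical(pets, pets2)              # TRUE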

6.5 substitute()

The substitute function acts like quote():

identical(quote(z <- x^2), substitute(z <- x^2))

## [1] TRUE

But if you also pass substitute() an environment, it will replace symbols with their object values in that environment.

e <- new.env() e$x <- 3 substitute(z <- x^2, e)

## z <- 3^2

This can do non-sensical stuff:

e$z <- 5 substitute(z <- x^2, e)

## 5 <- 3^2

Let’s see a practical example of substituting for variables in statements:

plot(x = rnorm(5), y = rgamma(5, 1))  # how does plot get the axis label names?

In the plot() function, you can see this syntax:

xlabel <- if (!missing(x)) deparse(substitute(x))

So what’s going on is that within plot.default(), it substitutes in for ’x’ the statement that was passed in as the x argument, and then uses deparse() to convert to character. The fact that x still has rnorm(5) associated with it rather than the five numerical values from evaluating rnorm() has to do with lazy evaluation and promises. Here’s the same idea in action in a stripped-down example:

f <- function(obj) {
    objName <- deparse(substitute(obj))
    print(objName)
}
f(y)

## [1] "y"

More generally, we can substitute into expression and call objects by providing a named list (or an environment) - the substitution happens within the context of this list.

substitute(a + b, list(a = 1, b = quote(x)))

## 1 + x

Things can get intricate quickly:

e1 <- quote(x + y)
e2 <- substitute(e1, list(x = 3))

The problem is that substitute() doesn’t evaluate its first argument, “e1”, so it can’t replace the parsed elements in e1. Instead, we’d need to do the following, where we force the evaluation of e1:

e2 <- substitute(substitute(e, list(x = 3)), list(e = e1))
substitute(substitute(e, list(x = 3)), list(e = e1))

## substitute(x + y, list(x = 3))

# so e1 is substituted as an evaluated object,
# which then allows for substitution for 'x'
e2

## substitute(x + y, list(x = 3))

eval(e2)

## 3 + y

If this subsection is confusing, let me assure you that it has confused me too. The indirection going on here is very involved.

6.6 Final thoughts

Challenge: figure out how a for loop is parsed in R. See how a for loop with one statement within the loop differs from one with two or more statements. We’ll see expression() again when we talk about inserting mathematical notation in plots.
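
As a starting point for the for loop challenge (a sketch of how to investigate, not a full answer), you can inspect the parse tree just as we did above:

floop <- quote(for (i in 1:3) print(i))
as.list(floop)        # what are the components of a for statement?
lapply(floop, class)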

7 Programming concepts

A few thoughts on what we’ve learned that is germane beyond R include:

• variable types

• passing by reference and by value

• variable scope

• the call stack

• flow control

• object-oriented programming

• matrix storage concepts

• parsing

If you go to learn other languages, these and other concepts that we’ve covered should be helpful.

Parallel Processing

October 17, 2012

Reference: Schmidberger M, Morgan M, Eddelbuettel D, Yu H, Tierney L, Mansmann U (2009). "State of the Art in Parallel Computing with R." Journal of Statistical Software, 31(1), 1-27. http://www.jstatsoft.org/v31/i01 That reference is a bit old, and I’ve pulled material from a variety of sources, often presentations, and not done a good job documenting this, so I don’t have a good list of references for this topic.

1 Computer architecture

Computers now come with multiple processors for doing computation. Basically, physical constraints have made it harder to keep increasing the speed of individual processors, so the industry is now putting multiple processing units in a given computer and trying/hoping to rely on implementing computations in a way that takes advantage of the multiple processors. Everyday personal computers often have more than one processor (more than one chip) and, on a given processor, often have more than one core (multi-core). A multi-core processor has multiple processing cores on a single computer chip. On personal computers, all the processors and cores share the same memory. Supercomputers and computer clusters generally have tens, hundreds, or thousands of ’nodes’, linked by a fast local network. Each node is essentially a computer with its own processor(s) and memory. Memory is local to each node (distributed memory). One basic principle is that communication between a processor and its memory is much faster than communication between processors with different memory. An example of a modern supercomputer is the Jaguar supercomputer at Oak Ridge National Lab, which has 18,688 nodes, each with two processors and each processor with 6 cores, giving 224,256 total processing cores. Each node has 16 Gb of memory for a total of 300 Tb.

There is little practical distinction between multi-processor and multi-core situations. The main issue is whether processes share memory or not. In general, I won’t distinguish between cores and processors.

1.1 Distributed vs. shared memory

There are two basic flavors of parallel processing (leaving aside GPUs): distributed memory and shared memory. With shared memory, multiple processors (which I’ll call cores) share the same memory. With distributed memory, you have multiple nodes, each with their own memory. You can think of each node as a separate computer connected by a fast network. Parallel programming for distributed memory parallelism requires passing messages between the different nodes. The standard protocol for doing this is MPI, of which there are various versions, including openMPI. The R package Rmpi implements MPI in R. With shared memory parallelism, each core is accessing the same memory so there is no need to pass messages. But one needs to be careful that activity on different cores doesn’t mistakenly overwrite places in memory that are used by other cores. We’ll focus on shared memory parallelism here in this unit, though Rmpi will come up briefly.

1.2 Graphics processing units (GPUs)

GPUs were formerly a special piece of hardware used by gamers and the like for quickly rendering (displaying) graphics on a computer. They do this by having hundreds of processing units and breaking up the computations involved in an embarrassingly parallel fashion (i.e., without inter-processor communication). For particular tasks, GPUs are very fast. Nowadays GPUs are generally built onto PC motherboards whereas previously they were on a video card. Researchers (including some statisticians) are increasingly looking into using add-on GPUs to do massively parallel computations with inexpensive hardware that one can easily add to one’s existing machine. This has promise. However, some drawbacks in current implementations include the need to learn a specific programming language (similar to C) and limitations in terms of transferring data to the GPU and holding information in the memory of the GPU.

1.3 Cloud computing

Amazon (through its EC2 service) and other companies (Google and Microsoft now have offerings) offer computing through the cloud. The basic idea is that they rent out their servers on a pay-as-you-go basis. You get access to a virtual machine that can run various versions of Linux or Windows Server and where you choose the number of processing cores you want. You

configure the virtual machine with the applications, libraries, and data you need and then treat the virtual machine as if it were a physical machine that you log into as usual.

2 Parallelization

2.1 Overview

A lot of parallel processing follows a master-slave paradigm. There is one master process that controls one or more slave processes. The master process sends out tasks and data to the slave processes and collects results back. One comment about parallelized code is that it can be difficult to debug because communication problems can occur in addition to standard bugs. Debugging may require some understanding of the communication that goes on between processors. If you can first debug your code on one processor, that can be helpful.

2.2 Threading

One form of parallel processing is threading. Here an algorithm is implemented across multiple “light-weight” processes called threads in a shared memory situation. Threads are multiple paths of execution within a single process. One can write one’s own code to make use of threading, e.g., using the OpenMP protocol for C/C++/Fortran. For our purposes we’ll focus on using pre-existing code or libraries that are threaded, specifically threaded versions of the BLAS. In R, the basic strategy is to make sure that the R installation uses a threaded BLAS so that standard linear algebra computations are done in a threaded fashion. We can do this by linking R to a threaded BLAS shared object library (i.e., a .so file). Details can be found in Section A.3.1.5 of the R administration manual, or talk to your system administrator. Note that in R, the threading only helps with linear algebra operations. In contrast, Matlab uses threading for a broader range of calculations.

2.2.1 The BLAS

The BLAS is the library of basic linear algebra operations (written in Fortran or C). A fast BLAS can greatly speed up linear algebra relative to the default BLAS on a machine. Some fast BLAS libraries are Intel’s MKL, AMD’s ACML, and the open source (and free) openBLAS (formerly GotoBLAS). All of these BLAS libraries are now threaded - if your computer has multiple cores and there are free resources, your linear algebra will use multiple cores, provided your program is linked against the specific BLAS. Using top (on a machine other than the cluster), you’ll see

the process using more than 100% of CPU. Inconceivable! The default BLAS on the SCF Linux compute servers is openBLAS. On Macs, the threaded BLAS is generally implemented by default, so you don’t need to do anything.

X <- matrix(rnorm(8000^2), 8000)
system.time({
    X <- crossprod(X)  # X^t X produces a pos.def. matrix
    U <- chol(X)       # U^t U = X
})
# exit R, execute: 'export OMP_NUM_THREADS=1', and restart R
system.time({
    X <- crossprod(X)
    U <- chol(X)
})

2.2.2 Fixing the number of threads (cores used)

In general, if you want to limit the number of threads used, you can set the OMP_NUM_THREADS environment variable, as indicated in the previous example. This can be used in the context of R or C code that uses the BLAS or your own threaded C code, but it does not work with Matlab. In the UNIX bash shell, you’d do this as follows (e.g., to limit to 3 cores; do this before starting R):

export OMP_NUM_THREADS=3  # or "setenv OMP_NUM_THREADS 3" if using csh/tcsh

2.3 Embarrassingly parallel (EP) problems

An EP problem is one that can be solved by doing independent computations as separate processes without communication between the processes. You can get the answer by doing separate tasks and then collecting the results. Examples in statistics include

1. simulations with many independent replicates

2. bootstrapping

3. stratified analyses

The standard setup is that we have the same code running on different datasets. (Note that different processes may need different random number streams, as we will discuss in the Simulation Unit.)

To do parallel processing in this context, you need to have control of multiple processes. In a system with a queueing setup, this will generally mean requesting access to a certain number of processors and then running your job in such a way that you use multiple processors. In general, except for some modest overhead, an EP problem can be solved in 1/p the amount of time of the non-parallel implementation, given p processors. This gives us a speedup of p, which is called linear speedup (basically any time the speedup is of the form cp for some constant c). One difficulty is load balancing. We’d like to make sure each slave process finishes at the same time. Often we can give each process the same amount of work, but if we have a mix of faster and slower processors, things become more difficult. To the extent it is possible to break up a job into many small tasks and have processors start new tasks as they finish off old tasks, this can be effective, but it may involve some parallel programming. In the next section, we’ll see a few approaches in R for dealing with EP problems.

2.4 Parallelization with communication

If we do not have an EP problem, we have one that involves some sort of serial calculation. As a result, different processes need to communicate with each other. There are standard protocols for such communication, with MPI being most common. You can use C libraries that implement these protocols. While MPI has many functions, a core of 6-10 functions (basic functions for functionality such as sending and receiving data between processes - either master-slave or slave-slave) are what we mostly need. R provides the Rmpi library, which allows you to do message passing in R. It has some drawbacks, but may be worth exploring if you have a non-EP problem and don’t want to learn C. Installing Rmpi may be tricky and on institutional machines will require you talk to your systems administrator. Rmpi is a basic building block for other parallel processing functionality such as foreach and SNOW. For non-EP problems, the primary question is how the speed of the computation scales with p. This will generally be much worse than 1/p and as p increases, if communication must increase as well, then the speedup can be much worse.

3 Explicit parallel code in R

Before we get into some functionality, let’s define some terms more explicitly.

• threading: multiple paths of execution within a single process; the OS sees the threads as a single process, but one can think of them as ’lightweight’ processes

• forking: child processes are spawned that are identical to the parent, but with different process IDs and their own memory

• sockets: some of R’s parallel functionality involves creating new R processes and communicating with them via a communication technology called sockets

3.1 foreach

A simple way to exploit parallelism in R when you have an EP problem is to use the foreach package to do a for loop in parallel. For example, bootstrapping, random forests, simulation studies, cross-validation, and many other statistical methods can be handled in this way. You would not want to use foreach if the iterations were not independent of each other. The foreach package provides a foreach command that allows you to do this easily. foreach can use a variety of parallel “back-ends”. It can use Rmpi to access cores in a distributed memory setting or (our focus here) the parallel or multicore packages to use shared memory cores. When using parallel or multicore as the back-end, you should see multiple processes (as many as you registered; ideally each at 100%) when you look at top. The multiple processes are generally created by forking.

require(parallel)    # one of the core R packages
require(doParallel)
# require(multicore); require(doMC)  # alternative to parallel/doParallel
# require(Rmpi); require(doMPI)      # when Rmpi is available as the back-end
library(foreach)
library(iterators)
taskFun <- function() {
    mn <- mean(rnorm(1e+07))
    return(mn)
}
nCores <- 4
registerDoParallel(nCores)
# registerDoMC(nCores)  # alternative to registerDoParallel
# cl <- startMPIcluster(nCores); registerDoMPI(cl)  # when using Rmpi as the back-end
out <- foreach(i = 1:100, .combine = c) %dopar% {
    cat("Starting ", i, "th job.\n", sep = "")
    outSub <- taskFun()
    cat("Finishing ", i, "th job.\n", sep = "")
    outSub  # this will become part of the out object
}

The result of foreach will generally be a list, unless foreach is able to put it into a simpler R object. Here I’ve explicitly told foreach to combine the results with c() (cbind() and rbind() are other common choices), but it will often be smart enough to figure it out on its own. Note that foreach also provides some additional functionality for collecting and managing the results that means you don’t have to do some of the bookkeeping you would need to do if writing your own for loop. You can debug by running serially using %do% rather than %dopar%. Note that you may need to load packages within the foreach code block to ensure a package is available to all of the calculations.

Warning: There is some sort of conflict between foreach and the threaded BLAS on the SCF Linux compute servers, so before running an R job that does linear algebra within a call to foreach, you may need to set OMP_NUM_THREADS to 1 to prevent the BLAS from doing threaded calculations. Hopefully we’ll be able to fix this in the future.

Caution: Note that I didn’t pay any attention to possible danger in generating random numbers in separate processes. The developers of foreach are aware of this issue, but I can’t tell from the documentation how they handle it. More on an approach that is explicit about this in the unit on simulation.

3.2 Parallel apply and vectorization (parallel package)

The parallel package has the ability to parallelize the various apply() functions (apply, lapply, sapply, etc.) and parallelize vectorized functions. The multicore package also has this ability and parallel is built upon multicore. parallel is a core R package so we’ll explore the functionality in that setting. Here’s the vignette for the parallel package – it’s hard to find because parallel is not listed as one of the contributed packages on CRAN. First let’s consider parallel apply.

require(parallel)
nCores <- 4

### using sockets ###
# ?clusterApply
cl <- makeCluster(nCores)  # by default this uses sockets
nSims <- 60
testFun <- function(i) {
    mn <- mean(rnorm(1e+06))
    return(mn)
}
# if the processes need objects (x and y, here) from the master's workspace:
clusterExport(cl, c('x', 'y'))
system.time(res <- parSapply(cl, 1:nSims, testFun))
system.time(res2 <- sapply(1:nSims, testFun))
myList <- as.list(1:nSims)
res <- parLapply(cl, myList, testFun)

### using forking ###
system.time(res <- mclapply(seq_len(nSims), testFun, mc.cores = nCores))

Now let’s consider parallel evaluation of a vectorized function.

require(parallel)
nCores <- 4
cl <- makeCluster(nCores)
library(fields)
ds <- runif(6e+06, 0.1, 10)
system.time(corVals <- pvec(ds, Matern, 0.1, 2, mc.cores = nCores))
system.time(corVals <- Matern(ds, 0.1, 2))

Note that some R packages can directly interact with the parallelization packages to work with multiple cores. E.g., the boot package can make use of the multicore package directly.
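
For instance, here’s a hedged sketch (my own example, not taken from the notes) of the boot package using multiple cores via its parallel and ncpus arguments:

library(boot)
meanFun <- function(dat, idx) mean(dat[idx])  # statistic takes the data and resampled indices
x <- rnorm(1000)
# parallel = "multicore" uses forking; on Windows one would use parallel = "snow" instead
res <- boot(x, statistic = meanFun, R = 2000, parallel = "multicore", ncpus = 4)
res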

3.3 Explicit parallel programming in R: mcparallel and forking

Now let’s discuss some functionality in which one more explicitly controls the parallelization.

3.3.1 Using mcparallel to dispatch blocks of code to different processes

First one can use mcparallel() in the parallel package to send different chunks of code to different processes.

library(parallel)
n <- 1e+07
system.time({
    p <- mcparallel(mean(rnorm(n)))
    q <- mcparallel(mean(rgamma(n, shape = 1)))
    res <- mccollect(list(p, q))
})
system.time({
    p <- mean(rnorm(n))
    q <- mean(rgamma(n, shape = 1))
})

3.3.2 Explicitly forking code in R

The fork package and its fork() function provide an implementation of the UNIX fork system call for forking an R process. Note that the code here does not handle passing information back from the child very well. One approach is to use sockets – the help page for fork() has a bit more information.

library(fork)
# mode 1
pid <- fork(slave = myfun)
# mode 2
{
    # this set of braces is REQUIRED when you don't pass a function
    # to the slave argument of fork()
    pid <- fork(slave = NULL)
    if (pid == 0) {
        cat("Starting child process execution.\n")
        tmpChild <- mean(rnorm(1e+07))
        cat("Result is ", tmpChild, "\n", sep = "")
        save(tmpChild, file = "child.RData")  # clunky
        cat("Finishing child process execution.\n")
        exit()
    } else {
        cat("Starting parent process execution.\n")
        tmpParent <- mean(rnorm(1e+07))
        cat("Finishing parent process execution.\n")
        wait(pid)  # wait til child is finished so we can read in the updated child.RData below
    }
}
load("child.RData")  # clunky
print(c(tmpParent, tmpChild))

Note that if we were really running the above code, we’d want to be careful about the random number generation (RNG). As it stands, it will use the same random numbers in both child and parent processes.
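
One way to address this (a sketch; more detail in the simulation unit) is to use the L'Ecuyer-CMRG generator, which the parallel package can use to give each forked process its own random number stream:

library(parallel)
RNGkind("L'Ecuyer-CMRG")
set.seed(1)
# mc.set.seed = TRUE (the default) gives each child its own stream
res <- mclapply(1:4, function(i) rnorm(2), mc.cores = 2, mc.set.seed = TRUE)
res  # the four sets of draws differ rather than repeating the same numbers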

Computer numbers

October 19, 2012

References:

• Gentle, Computational Statistics, Chapter 2.

• http://www.lahey.com/float.htm

• And for more gory detail, see Monahan, Chapter 2.

A quick note that R’s version of scientific notation is XeY, which means X · 10^Y. A second note is that the concepts developed here apply outside of R, but we’ll illustrate the principles of computer numbers using R.

1 Basic representations

Everything in computer memory or on disk is stored in terms of bits. A bit is essentially a switch that can be either on or off. Thus everything is encoded as numbers in base 2, i.e., 0s and 1s. 8 bits make up a byte. For information stored as plain text (ASCII), each byte is used to encode a single character. One way to represent a byte is to write it in hexadecimal, rather than as 8 0/1 bits. Since there are 2^8 = 256 possible values in a byte, we can represent it more compactly as 2 base-16 numbers, such as “3e” or “a0” or “ba”. A file format is nothing more than a way of interpreting the bytes in a file. We can think about how we’d store an integer in terms of bytes. With two bytes, we could encode any value from 0, ..., 2^16 − 1 = 65535. This is an unsigned integer representation. To store negative numbers as well, we can use one bit for the sign, giving us the ability to encode -32767 to 32767 (±(2^15 − 1)). Note that in general, rather than being stored simply as the sign and then a number in base 2, integers are actually stored in a different binary encoding to facilitate arithmetic. Finally, note that the set of computer integers is not closed under arithmetic:

a <- as.integer(3423333)
a * a

## Warning: NAs produced by integer overflow

## [1] NA

and R reports an overflow (i.e., a result that is too large to be stored as an integer). Real numbers (or floating points) use a minimum of 4 bytes, for single precision floating points. In general 8 bytes are used to represent real numbers; these are called double precision floating points or doubles. Let’s see some examples of how much space different types of variables take up in R.

x <- rnorm(1e+05)
y <- 1:1e+05
z <- rep("a", 1e+05)
object.size(x)  # 800040 bytes

## 800040 bytes

object.size(y) # 400040 bytes - so how many bytes per integer in R?

## 400040 bytes

object.size(z) # 800088 bytes - hmm, what's going on here?

## 800088 bytes

We can easily calculate the number of megabytes (Mb) a vector of floating points (in double precision) will use as the number of elements times 8 (bytes/double) divided by 10^6 to convert from bytes to megabytes. (In some cases when considering computer memory, a megabyte is 1,048,576 = 2^20 bytes, so slightly different than 10^6.) Finally, R has a special object, .Machine, that tells us about the characteristics of computer numbers on the machine that R is running on. For example, .Machine$integer.max is 2147483647 = 2^31 − 1, which confirms how many bytes R is using for each integer (and that R is using a bit for the sign of the integer).
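
As a quick check of that rule of thumb (my own example), compare the predicted size of a vector of one million doubles to what object.size() reports:

n <- 1e6
8 * n / 1e6            # predicted size in Mb
object.size(rnorm(n))  # about 8e6 bytes, plus a small fixed overhead for the object header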

2 Floating point basics

2.1 Representing real numbers

Reals (also called floating points) are stored on the computer as an approximation, albeit a very precise approximation. As an example, with a double, the error in the distance from the earth to the sun is around a millimeter. However, we need to be very careful if we’re trying to do a calculation that produces a very small (or very large number) and particularly when we want to see if numbers are equal to each other.

0.3 - 0.2 == 0.1

## [1] FALSE

0.3

## [1] 0.3

0.2

## [1] 0.2

0.1 # Hmmm...

## [1] 0.1

options(digits = 22) a <- 0.3 b <- 0.2 a - b

## [1] 0.09999999999999997779554

0.1

## [1] 0.1000000000000000055511

1/3

## [1] 0.3333333333333333148296

Notice that we can represent the result accurately only up to the 16th decimal place. This suggests there is no need to show more than 16 decimal places and no need to print out any more when writing to a file. And of course, often we don’t need anywhere near that many. Machine epsilon is the term used for indicating the accuracy of real numbers, and it is defined as the smallest float, x, such that 1 + x ≠ 1:

1e-16 + 1

## [1] 1

1e-15 + 1

## [1] 1.000000000000001110223

2e-16 + 1

## [1] 1.000000000000000222045

.Machine$double.eps

## [1] 2.220446049250313080847e-16

Floating point refers to the decimal point (or radix point, since we’ll be working with base 2 and decimal relates to 10). Consider Avogadro’s number in terms of scientific notation: +6.023 × 10^23. A real number on a computer is stored in what is basically scientific notation:

±0.d1 d2 . . . dp × b^e      (1)

where b is the base, e is an integer, and di ∈ {0, ..., b − 1}. First, we need to choose the number of bits to represent e so that we can represent sufficiently large and small numbers. Second, we need to choose the number of bits, p, to allocate to d = d1 d2 . . . dp, which determines the accuracy of any computer representation of a real. The great thing about floating points is that we can represent numbers that range from incredibly small to very large while maintaining good precision. The floating point floats to adjust to the size of the number. Suppose we had only three digits to use and were in base 10. In floating point notation we can express 0.12 × 0.12 = 0.0144 as (1.20 × 10^−1) × (1.20 × 10^−1) = 1.44 × 10^−2, but if we had fixed the decimal point, we’d have 0.120 × 0.120 = 0.014 and we’d have lost a digit of accuracy.

More specifically, the actual storage of a number on a computer these days is generally as a double, in the form (−1)^S × 1.d × 2^(e−1023), where the computer uses base 2, b = 2, because base-2 arithmetic is faster than base-10 arithmetic. The leading 1 normalizes the number; i.e., it ensures there is a unique representation for a given computer number. This avoids representing any number in multiple ways, e.g., 1 = 1.0 × 2^0 = 0.1 × 2^1 = 0.01 × 2^2. For a double, we have 8 bytes = 64 bits. Consider our representation as (S, d, e) where S is the sign. The leading 1 is the hidden bit. In general e is represented using 11 bits (2^11 = 2048), and the subtraction of 1023 takes the place of having a sign bit for the exponent. This leaves p = 52 bits for d. Let’s consider what can be represented exactly:

0.1

## [1] 0.1000000000000000055511

0.5

## [1] 0.5

0.25

## [1] 0.25

0.26

## [1] 0.2600000000000000088818

1/32

## [1] 0.03125

1/33

## [1] 0.03030303030303030387138

So why is 0.5 stored exactly and 0.1 not stored exactly? By analogy, consider the difficulty with representing 1/3 in base 10.

2.2 Overflow and underflow

The largest and smallest positive numbers we can represent are roughly 2^emax and 2^emin, where emax and emin are the largest and smallest possible values of the exponent. Let’s consider the exponent and what we can infer about the range of possible numbers. With 11 bits for e, we can represent ±2^10 = ±1024 different exponent values (see .Machine$double.max.exp) (why is .Machine$double.min.exp only -1022?). So the largest number we could represent is 2^1024. What is this in base 10?

log10(2^1024)  # whoops ... we've actually just barely overflowed

## [1] Inf

log10(2^1023)

## [1] 307.9536855642527370946

We could have been smarter about that calculation: log10(2^1024) = log2(2^1024)/log2(10) = 1024/3.32 ≈ 308. Analogously for the smallest number. So we have that floating points can range between about 1 × 10^−308 and 1 × 10^308. Take a look at .Machine$double.xmax and .Machine$double.xmin. Producing something larger or smaller in magnitude than these values is called overflow and underflow, respectively. When we overflow, R gives back an Inf or -Inf (and in other cases we might get an error message). When we underflow, we get back 0, which in particular can be a problem if we try to divide by the value.

2.3 Integers or floats?

Values stored as integers should overflow if they exceed .Machine$integer.max. Should 2^45 overflow?

x <- 2^45
z <- 25
class(x)

## [1] "numeric"

class(z)

## [1] "numeric"

as.integer(x)

## Warning: NAs introduced by coercion

## [1] NA

as.integer(z)

## [1] 25

1e308

## [1] 1.000000000000000010979e+308

as.integer(1e308)

## Warning: NAs introduced by coercion

## [1] NA

1e309

## [1] Inf

In R, numbers are generally stored as doubles. We’ve basically already seen why - consider the maximum integer when using 4 bytes versus the maximum floating point value. Representing integers as floats isn’t generally a problem, in part because integers will be stored exactly in base two provided the absolute value is less than 2^53. Why 2^53? However, you can get storage as integers in a few ways: values generated by seq(), values specified with an “L” suffix, or values explicitly coerced:

x <- 3 typeof(x)

## [1] "double"

x <- as.integer(3)
typeof(x)

## [1] "integer"

x <- 3L
typeof(x)

## [1] "integer"

2.4 Precision

Consider our representation as (S, d, e) where we have p = 52 bits for d. Since we have 2^52 ≈ 0.5 × 10^16, we can represent about that many discrete values, which means we can accurately represent about 16 digits (in base 10). The result is that floats on a computer are actually discrete (we have a finite number of bits), and if we get a number that is in one of the gaps (there are uncountably many reals), it’s approximated by the nearest discrete value. The accuracy of our representation is to within 1/2 of the gap between the two discrete values bracketing the true number. Let’s consider the implications for accuracy in working with large and small numbers. By changing e we can change the magnitude of a number. So regardless of whether we have a very large or small number, we have about 16 digits of accuracy, since the absolute spacing depends on what value is represented by the least significant digit (the ulp, or unit in the last place) in d, i.e., the p = 52nd one, or in terms of base 10, the 16th digit. Let’s explore this:

options(digits = 22)
.1234123412341234

## [1] 0.1234123412341233960721

1234.1234123412341234 # not accurate to 16 places

## [1] 1234.123412341234143241

123412341234.123412341234 # only accurate to 4 places

## [1] 123412341234.1234130859

1234123412341234.123412341234 # no places!

## [1] 1234123412341234

12341234123412341234 # fewer than no places!

## [1] 12341234123412340736

We can see the implications of this in the context of calculations:

1234567812345678 - 1234567812345677

## [1] 1

12345678123456788888 - 12345678123456788887

## [1] 0

.1234567812345678 - .1234567812345677

## [1] 9.714451465470119728707e-17

.12345678123456788888 - .12345678123456788887

## [1] 0

.00001234567812345678 - .00001234567812345677

## [1] 8.470329472543003390683e-21

# the above is not as close as we'd expect; it should be 1e-20
.000012345678123456788888 - .000012345678123456788887

## [1] 0

Suppose we try this calculation: 123456781234 − .0000123456781234. How many decimal places do we expect to be accurate? The spacing of possible computer numbers that have a magnitude of about 1 leads us to another definition of machine epsilon (an alternative, but essentially equivalent, definition to that given above). Machine epsilon tells us also about the relative spacing of numbers. First let’s consider numbers of magnitude one. The difference between 1 = 1.00...00 × 2^0 and 1.000...01 × 2^0 is 1 × 2^−52 ≈ 2.2 × 10^−16. Machine epsilon gives the absolute spacing for numbers near 1 and the relative spacing for numbers with a different order of magnitude and therefore a different absolute magnitude of the error in representing a real. The relative spacing at x is

$$\frac{(1+\epsilon)x - x}{x} = \epsilon$$

since the next largest number from $x$ is given by $(1+\epsilon)x$. Suppose $x = 1 \times 10^{6}$. Then the absolute error in representing a number of this magnitude is $x\epsilon \approx 2 \times 10^{-10}$. We can see this by looking at the number in decimal form, where we are accurate to the order $10^{-10}$ but not $10^{-11}$.

1000000.1

## [1] 1000000.099999999976717

Let’s see what arithmetic we can do exactly with integers stored as doubles and how that relates to the absolute spacing of numbers we’ve just seen:

2^52

## [1] 4503599627370496

2^52 + 1

## [1] 4503599627370497

2^53

## [1] 9007199254740992

2^53 + 1

## [1] 9007199254740992

2^53 + 2

## [1] 9007199254740994

2^54

## [1] 18014398509481984

2^54 + 2

## [1] 18014398509481984

2^54 + 4

## [1] 18014398509481988

The absolute spacing is $x\epsilon$, so $2^{52} \times 2^{-52} = 1$, $2^{53} \times 2^{-52} = 2$, and $2^{54} \times 2^{-52} = 4$.

3 Implications for calculations and comparisons

3.1 Computer arithmetic is not mathematical arithmetic!

As mentioned for integers, computer number arithmetic is not closed, unlike real arithmetic. For example, if we multiply two computer floating points, we can overflow and not get back another computer floating point. One term that is used, which might pop up in an error message (though probably not in R), is that an “exception” is “thrown”. Another mathematical concept we should consider here is that computer arithmetic does not obey the associative and distributive laws, i.e., $(a + b) + c$ may not equal $a + (b + c)$ on a computer and $a(b + c)$ may not be the same as $ab + ac$. Here’s an example:

val1 <- 1/10
val2 <- 0.31
val3 <- 0.57
res1 <- val1 * val2 * val3
res2 <- val3 * val2 * val1
identical(res1, res2)

## [1] FALSE

res1

## [1] 0.0176699999999999982081

res2

## [1] 0.01767000000000000167755

3.2 Calculating with integers vs. floating points

It’s important to note that operations with integers are fast and exact (but can easily overflow) while operations with floating points are slower and approximate. Because of this slowness, floating point operations (flops) dominate calculation intensity and are used as the metric for the amount of work being done - a multiplication (or division) combined with an addition (or subtraction) is one flop. We’ll talk a lot about flops in the next unit on linear algebra.

3.3 Comparisons

As we saw, we should never test a==b unless a and b are represented as integers in R, or are integers stored as doubles that are small enough to be stored exactly, or are decimal numbers that have been created in the same way (e.g., 0.1==0.1 is fine but 0.1==0.4-0.3 is not). Similarly we should never test a==0. One nice approach to checking for approximate equality is to make use of machine epsilon. If the relative spacing of two numbers is less than machine epsilon, then for our computer approximation we say they are the same. Here’s an implementation that relies on the absolute error being $x\epsilon$ (see above): if(abs(a - b) < .Machine$double.eps * abs(a + b)) print("approximately equal"). Actually, we probably want to use a number slightly larger than .Machine$double.eps to be safe. You can also take a look at the R function all.equal().
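
Here's a small sketch (with made-up values) of the pitfall and the two checks just mentioned:

0.1 == 0.4 - 0.3                               # FALSE, though mathematically equal
a <- 0.4 - 0.3
b <- 0.1
abs(a - b) < .Machine$double.eps * abs(a + b)  # TRUE: relative-spacing check from above
all.equal(a, b)                                # TRUE: default tolerance is about 1.5e-8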

3.4 Calculations

Given the limited precision of computer numbers, we need to be careful when:

• Subtracting large numbers that are nearly equal (or adding negative and positive numbers of the same magnitude). You won’t have the precision in the answer that you would like.

123456781234.56 - 123456781234.00

## [1] 0.55999755859375

The absolute error here, based on the larger value (which has the fewest error-free decimal places), is of the order $\epsilon x = 2.2 \times 10^{-16} \cdot 1 \times 10^{12} \approx 1 \times 10^{-4} = .0001$, so while we might think that the result, being a number of magnitude near one, should have error of about machine epsilon, we actually only have about four significant digits in our result. This is called catastrophic cancellation, because most of the digits that are left represent rounding error - all the significant digits have cancelled with each other. Here’s catastrophic cancellation with small numbers. The right answer here is EXACTLY .000000000000000000001234.

a = .000000000000123412341234

b = .000000000000123412340000

a - b

## [1] 1.233999993151397490851e-21

But the result is accurate only to 8 significant digits, i.e., to 29 decimal places (8 + 21), as expected from a machine precision-based calculation, since the “1” in the inputs is in the 13th decimal position (13 + 16 = 29). Ideally, we would have accuracy to 37 places (16 + the 21), but we’ve lost 8 digits to catastrophic cancellation. It’s best to do any subtraction on numbers that are not too large. For example, we can get catastrophic cancellation when computing a sum of squares in a naive way:

$$s^2 = \sum_i x_i^2 - n\bar{x}^2$$

x <- c(-1, 0, 1) + 1e8

n <- length(x)

sum(x^2)-n*mean(x)^2 # that's not good!

## [1] 0

sum((x - mean(x))^2)

## [1] 2

A good principle to take away is to subtract off a number similar in magnitude to the values (in this case x¯ is obviously ideal) and adjust your calculation accordingly. In general, you can sometimes rearrange your calculation to avoid catastrophic cancellation. Another example involves the quadratic formula for finding a root (p. 101 of Gentle).

• Adding or subtracting numbers that are very different in magnitude. The precision will be that of the large magnitude number, since we can only represent that number to a certain absolute accuracy, which is much less than the absolute accuracy of the smaller number:

123456781234 - 1e-06

## [1] 123456781234

The absolute error in representing the larger number is around $1 \times 10^{-4}$ and the smaller number is smaller than this. A work-around is to add a set of numbers in increasing order. However, if the numbers are all of similar magnitude, then by the time you add ones later in the summation, the partial sum will be much larger than the new terms. A further work-around is to add the numbers in a tree-like fashion, so that each addition involves a summation of numbers of similar size.
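
As a small illustration (the values are made up, not from the notes), adding tiny terms to an already-large partial sum loses them, while accumulating the tiny terms first preserves their contribution. R's own sum() is cleverer than a plain loop, so we accumulate by hand:

addUp <- function(vals) { s <- 0; for (v in vals) s <- s + v; s }
little <- rep(1e-16, 10000)
addUp(c(1, little)) - 1   # big term first: each 1e-16 is below the ulp of 1 and is lost
addUp(c(little, 1)) - 1   # small terms first: their sum (about 1e-12) survives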

Given the limited range of computer numbers, be careful when you are:

• Multiplying or dividing many numbers, particularly large or small ones. Never take the prod- uct of many large or small numbers as this can cause over- or under-flow. Rather compute on the log scale and only at the end of your computations should you exponentiate. E.g.,

$$\prod_i x_i \Big/ \prod_j y_j = \exp\Big(\sum_i \log x_i - \sum_j \log y_j\Big)$$
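
A minimal sketch of this (the values are arbitrary): the individual products underflow, but the ratio computed on the log scale is fine.

set.seed(1)
x <- runif(1000, 0.01, 0.2)
y <- runif(1000, 0.01, 0.2)
prod(x) / prod(y)                # NaN: both products underflow to 0
exp(sum(log(x)) - sum(log(y)))   # fine: the ratio itself is of moderate size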

• Challenge: Let’s think about how we can handle the following calculation. Suppose I want to calculate a predictive density (e.g., in a model comparison in a Bayesian context):

$$
\begin{aligned}
f(y^*|y,x) &= \int f(y^*|y,x,\theta)\,\pi(\theta|y,x)\,d\theta \\
&\approx \frac{1}{m}\sum_{j=1}^{m}\prod_{i=1}^{n} f(y_i^*|x,\theta_j) \\
&= \frac{1}{m}\sum_{j=1}^{m}\exp\Big(\sum_{i=1}^{n}\log f(y_i^*|x,\theta_j)\Big) \\
&\equiv \frac{1}{m}\sum_{j=1}^{m}\exp(v_j)
\end{aligned}
$$

First, why do I use the log conditional predictive density? Second, let’s work with an estimate of the unconditional predictive density on the log scale, $\log f(y^*|y,x) \approx \log \frac{1}{m}\sum_{j=1}^{m}\exp(v_j)$. Now note that $e^{v_j}$ may be quite small, as $v_j$ is the sum of log likelihoods. So what happens if we have terms something like $e^{-1000}$? So we can’t exponentiate each individual $v_j$. Thoughts? I have one solution in mind, but there might be other approaches.

Numerical issues come up frequently in linear algebra. For example, they come up in working with positive definite and positive semi-definite matrices, such as covariance matrices. You can easily get negative numerical eigenvalues even if all the eigenvalues are positive or non-negative. Here’s an example where we use a squared exponential correlation as a function of time (or distance in 1-d), which is mathematically positive definite but not numerically positive definite:

library(fields)  # for rdist(), if not already loaded
xs <- 1:100
dists <- rdist(xs)
corMat <- exp(-(dists/10)^2)
eigen(corMat)$values[80:100]

## [1]  2.025087032040071293067e-18 -3.266419741215397140920e-17
## [3] -3.444200677004415898082e-17 -4.886954483578307325434e-17
## [5] -6.129347918638579386910e-17 -9.880603825772825889419e-17
## [7] -9.967343900132262641741e-17 -1.230143695483269682612e-16
## [9] -1.248024408367381367725e-16 -1.292974005668460125397e-16
## [11] -1.331124664942472191787e-16 -1.651346230025272135916e-16
## [13] -1.951061360969111230889e-16 -1.990648753104187720567e-16
## [15] -2.015924870201480734054e-16 -2.257013487287240792239e-16
## [17] -2.335683529300324037256e-16 -2.719929490669187250991e-16
## [19] -2.882703020809833099805e-16 -3.057847173103147957185e-16
## [21] -4.411825302647757790411e-16

3.5 Final note

How the computer actually does arithmetic with the floating point representation in base 2 gets pretty complicated, and we won’t go into the details. These rules of thumb should be enough for our practical purposes. Monahan and the URL reference have many of the gory details.

4 Some additional details

In computing, we often encounter the use of an unusual integer as a symbol for missing values. E.g., a datafile might store missing values as -9999. Testing for this using == in R should generally be ok: x[x == -9999] <- NA, but only because integers of this magnitude are stored exactly as doubles. To be really careful, you can read the data in as character type and do the assessment before converting to numeric. Finally, be wary of x[x < 0] <- NA if what you are looking for is values that might be mathematically less than zero, rather than whatever is numerically less than zero.
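
A tiny illustration of the recoding just described (the -9999 code and the values are made up):

x <- c(3.2, -9999, 7.1, -9999, 0.4)
x[x == -9999] <- NA   # safe here because -9999 is representable exactly as a double
x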

Numerical linear algebra

November 2, 2012

References:

• Gentle: Numerical Linear Algebra for Applications in Statistics (my notes here are based primarily on this source) [Gentle-NLA]

– Unfortunately, this is not in the UCB library system - I have a copy (on loan from a colleague) that you could take a look at.

• Gentle: Computational Statistics [Gentle-CS]

• Lange: Numerical Analysis for Statisticians

• Monahan: Numerical Methods of Statistics

In working through how to compute something or understanding an algorithm, it can be very helpful to depict the matrices and vectors graphically. We’ll see this on the board in class.

1 Preliminaries

1.1 Goals

Here’s what I’d like you to get out of this unit:

1. How to think about the computational order (number of computations involved) of a problem

2. How to choose a computational approach to a given linear algebra calculation you need to do.

3. An understanding of how issues with computer numbers play out in terms of linear algebra.

1.2 Key principle

The form of a mathematical expression and how it should be evaluated on a computer may be very different. Better computational approaches can increase speed and improve the numerical properties of the calculation. Example 1: We do not compute $(X^\top X)^{-1}X^\top Y$ by computing $X^\top X$ and finding its inverse. In fact, perhaps more surprisingly, we may never actually form $X^\top X$ in some implementations. Example 2: Suppose I have a matrix $A$, and I want to permute (switch) two rows. I can do this with a permutation matrix, $P$, which is mostly zeroes. But on a computer, in many cases I wouldn’t even need to change the values of $A$ in memory. Why not?

1.3 Computational complexity

We can assess the computational complexity of a linear algebra calculation by counting the number of multiplies/divides and the number of adds/subtracts. Sidenote: addition is a bit faster than multiplication, so some algorithms attempt to trade multiplication for addition. In general we do not try to count the actual number of calculations, but just their order, though in some cases in this unit we’ll actually get a more exact count. In general, we denote this as $O(f(n))$, which means that the number of calculations approaches $cf(n)$ as $n \to \infty$ (i.e., we know the calculation is approximately proportional to $f(n)$). Consider matrix multiplication, $AB$, with matrices of size $a \times b$ and $b \times c$. Each column of the second matrix is multiplied by all the rows of the first. For any given inner product of a row by a column, we have $b$ multiplies. We repeat these operations for each column and then for each row, so we have $abc$ multiplies, i.e., $O(abc)$ operations. We could count the additions as well, but there’s usually an addition for each multiply, so we can usually just count the multiplies and then say there are such and such {multiply and add}s. This is Monahan’s approach, but you may see other counting approaches where one counts the multiplies and the adds separately. For two symmetric $n \times n$ matrices, this is $O(n^3)$. Similarly, matrix factorization (e.g., the Cholesky decomposition) is $O(n^3)$ unless the matrix has special structure, such as being sparse. As matrices get large, the speed of calculations decreases drastically because of the scaling as $n^3$, and memory use increases drastically. In terms of memory use, to hold the result of the multiply indicated above, we need to hold $ab + bc + ac$ total elements, which for square matrices sums to $3n^2$. So for a matrix with $n = 10000$, we have $3 \cdot 10000^2 \cdot 8/10^9 = 2.4$ GB. When we have $O(n^q)$ this is known as polynomial time. Much worse is $O(b^n)$ (exponential time), while much better is $O(\log n)$ (log time). Computer scientists talk about NP-complete problems; these are essentially problems for which there is not a polynomial time algorithm - it turns out all such problems can be rewritten such that they are equivalent to one another.

In real calculations, it’s possible to have the actual time ordering of two approaches differ from what the order approximations tell us. For example, something that involves $n^2$ operations may be faster than one that involves $1000(n \log n + n)$ even though the former is $O(n^2)$ and the latter $O(n \log n)$. The problem is that the constant, $c = 1000$, can matter (depending on how big $n$ is), as can the extra calculations from the lower order term(s), in this case $1000n$. A note on terminology: flops stands for both floating point operations (the number of operations required) and floating point operations per second, the speed of calculation.
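
As a rough empirical check of the $n^3$ scaling of matrix multiplication (a sketch; the sizes are arbitrary and the exact ratio depends on the BLAS in use):

n <- 1000
A <- matrix(rnorm(n^2), n); B <- matrix(rnorm(n^2), n)
t1 <- system.time(A %*% B)["elapsed"]
A2 <- matrix(rnorm((2 * n)^2), 2 * n); B2 <- matrix(rnorm((2 * n)^2), 2 * n)
t2 <- system.time(A2 %*% B2)["elapsed"]
t2 / t1   # doubling n: expect something on the order of 2^3 = 8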

1.4 Notation and dimensions

I’ll try to use capital letters for matrices, $A$, and lower-case for vectors, $x$. Then $x_i$ is the $i$th element of $x$, $A_{ij}$ is the $i$th row, $j$th column element, and $A_{\cdot j}$ is the $j$th column and $A_{i\cdot}$ the $i$th row. By default, we’ll consider a vector, $x$, to be a one-column matrix, and $x^\top$ to be a one-row matrix. Some of the textbook resources also use $a_{ij}$ for $A_{ij}$ and $a_j$ for the $j$th column.

Throughout, we’ll need to be careful that the matrices involved in an operation are conformable: for $A + B$ both matrices need to be of the same dimension, while for $AB$ the number of columns of $A$ must match the number of rows of $B$. Note that this allows for $B$ to be a column vector, with only one column, $Ab$. Just checking dimensions is a good way to catch many errors. Example: is $\mathrm{Cov}(Ax) = A\mathrm{Cov}(x)A^\top$ or $\mathrm{Cov}(Ax) = A^\top\mathrm{Cov}(x)A$? Well, if $A$ is $m \times n$, it must be the former, as the latter is not conformable.

The inner product of two vectors is $\sum_i x_i y_i = x^\top y \equiv \langle x, y\rangle \equiv x \cdot y$. The outer product is $xy^\top$, which comes from all pairwise products of the elements. When the indices of summation should be obvious, I’ll sometimes leave them implicit. Ask me if it’s not clear.

1.5 Norms

$\|x\|_p = (\sum_i |x_i|^p)^{1/p}$ and the standard (Euclidean) norm is $\|x\|_2 = \sqrt{\sum_i x_i^2} = \sqrt{x^\top x}$, just the length of the vector in Euclidean space, which we’ll refer to as $\|x\|$, unless noted otherwise. The standard norm for a matrix is the Frobenius norm, $\|A\|_F = (\sum_{i,j} a_{ij}^2)^{1/2}$. There is also the induced matrix norm, corresponding to any chosen vector norm,

$$\|A\| = \sup_{x \neq 0} \frac{\|Ax\|}{\|x\|}$$
So we have
$$\|A\|_2 = \sup_{x \neq 0} \frac{\|Ax\|_2}{\|x\|_2} = \sup_{\|x\|_2 = 1} \|Ax\|_2$$

A property of any legitimate matrix norm (including the induced norm) is that $\|AB\| \leq \|A\|\|B\|$. Recall that norms must obey the triangle inequality, $\|A + B\| \leq \|A\| + \|B\|$. A normalized vector is one with “length”, i.e., Euclidean norm, of one. We can easily normalize a vector: $\tilde{x} = x/\|x\|$. The angle between two vectors is

$$\theta = \cos^{-1}\left(\frac{\langle x, y\rangle}{\sqrt{\langle x, x\rangle\langle y, y\rangle}}\right)$$

1.6 Orthogonality

Two vectors are orthogonal if $x^\top y = 0$, in which case we say $x \perp y$. An orthogonal matrix is a matrix in which all of the columns are orthogonal to each other and normalized. Orthogonal matrices can be shown to have full rank. Furthermore if $A$ is orthogonal, $A^\top A = I$, so $A^{-1} = A^\top$. Given all this, the determinant of orthogonal $A$ is either 1 or -1. Finally the product of two orthogonal matrices, $A$ and $B$, is also orthogonal since $(AB)^\top AB = B^\top A^\top AB = I$.

Permutations Sometimes we make use of matrices that permute two rows (or two columns) of another matrix when multiplied. Such a matrix is known as an elementary permutation matrix and is an orthogonal matrix with a determinant of -1. You can multiply such matrices to get more general permutation matrices that are also orthogonal. If you premultiply by P , you permute rows, and if you postmultiply by P you permute columns. Note that on a computer, you wouldn’t need to actually do the multiply (and if you did, you should use a sparse matrix routine), but rather one can often just rework index values that indicate where relevant pieces of the matrix are stored (more in the next section).
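
As a small sketch of that last point (the matrix and permutation here are made up), permuting rows by indexing gives the same result as multiplying by the permutation matrix, without the multiply:

A <- matrix(rnorm(9), 3)
perm <- c(2, 1, 3)            # swap rows 1 and 2
P <- diag(3)[perm, ]          # explicit elementary permutation matrix
all.equal(P %*% A, A[perm, ]) # indexing avoids the matrix multiplication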

1.7 Some vector and matrix properties

$AB \neq BA$, but $A + B = B + A$ and $A(BC) = (AB)C$. In R, recall the syntax is

A + B
A %*% B

You don’t need the spaces, but they’re nice for code readability.

1.8 Trace and determinant of square matrices

For square matrices, $\mathrm{tr}(A + B) = \mathrm{tr}(A) + \mathrm{tr}(B)$ and $\mathrm{tr}(A) = \mathrm{tr}(A^\top)$.

We also have $\mathrm{tr}(ABC) = \mathrm{tr}(CAB) = \mathrm{tr}(BCA)$ - basically you can move a matrix from the beginning to the end or end to beginning. This is helpful for a couple of reasons:

1. We can find the ordering that reduces computation the most if the individual matrices are not square.

2. x⊤Ax = tr(x⊤Ax) since the quadratic form is a scalar, and this is equal to tr(xx⊤A). It can be helpful to be able to go back and forth between a scalar and a trace in some statistical calculations.

For square matrices, the determinant exists and we have $|AB| = |A||B|$ and therefore $|A^{-1}| = 1/|A|$ since $|I| = |AA^{-1}| = 1$. Also $|A| = |A^\top|$.

Other matrix multiplications The Hadamard or direct product is simply multiplication of the corresponding elements of two matrices by each other. In R this is simply A * B. Challenge: How can I find $\mathrm{tr}(AB)$ without using A %*% B? The Kronecker product is the product of each element of one matrix with the entire other matrix:

$$A \otimes B = \begin{pmatrix} A_{11}B & \cdots & A_{1m}B \\ \vdots & \ddots & \vdots \\ A_{n1}B & \cdots & A_{nm}B \end{pmatrix}$$
The inverse of a Kronecker product is the Kronecker product of the inverses,

$$(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$$
which is obviously quite a bit faster because the inverse (i.e., solving a system of equations) in this special case is $O(n^3 + m^3)$ rather than the $O((nm)^3)$ of the naive approach.
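
A quick numerical check of this identity (the matrices are made up):

A <- crossprod(matrix(rnorm(9), 3))   # small random positive definite matrices
B <- crossprod(matrix(rnorm(4), 2))
all.equal(solve(kronecker(A, B)), kronecker(solve(A), solve(B)))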

1.9 Linear independence, rank, and basis vectors

A set of vectors, $v_1, \ldots, v_n$, is linearly independent (LIN) when none of the vectors can be represented as a linear combination, $\sum_i c_i v_i$, of the others. If we have vectors of length $n$, we can have at most $n$ linearly independent vectors. The rank of a matrix is the number of linearly independent rows (or columns - it’s the same), and is at most the minimum of the number of rows and columns. We’ll generally think about it in terms of the dimension of the column space - so we can just think about the number of linearly independent columns.

Any set of linearly independent vectors (say $v_1, \ldots, v_n$) spans a space made up of all linear combinations of those vectors ($\sum_{i=1}^{n} c_i v_i$). The spanning vectors are known as basis vectors. We can express a vector $x$ that is in the space with respect to (as a linear combination of) normalized basis vectors using the inner product: $x = \sum_i c_i v_i$, where we can find the weights as $c_k = \langle x, v_k\rangle$. Consider a regression context. We have $p$ covariates ($p$ columns in the design matrix, $X$), of which $q$ are linearly independent covariates. This means that $p - q$ of the vectors can be written as linear combos of the $q$ vectors. The space spanned by the covariate vectors is of dimension $q$, rather than $p$, and $X^\top X$ has $p - q$ eigenvalues that are zero. The $q$ LIN vectors are basis vectors for the space - we can represent any point in the space as a linear combination of the basis vectors. You can think of the basis vectors as being like the axes of the space, except that the basis vectors are not orthogonal. So it’s like denoting a point in $\Re^q$ as a set of $q$ numbers telling us where on each of the axes we are - this is the same as a linear combination of axis-oriented vectors. When $q = n$, a vector of $n$ observations can be represented exactly as a linear combination of the $q$ basis vectors, so there is no residual; with $p = q = n$ there is a single unique solution, while with $p > n$ there are multiple solutions. In reality, we usually have $n > p$, so the system is overdetermined - there is no exact solution, but regression is all about finding solutions that minimize some criterion about the differences between the observations and linear combinations of the columns of the $X$ matrix (such as least squares or penalized least squares). In standard regression, we project the observation vector onto the space spanned by the columns of the $X$ matrix, so we find the point in the space closest to the observation vector.

1.10 Invertibility, singularity, rank, and positive definiteness

For square matrices, let’s consider how invertibility, singularity, rank and positive (or non-negative) definiteness relate. Square matrices that are “regular” have an eigendecomposition, $A = \Gamma\Lambda\Gamma^{-1}$, where $\Gamma$ is a matrix with the eigenvectors as the columns and $\Lambda$ is a diagonal matrix of eigenvalues, $\Lambda_{ii} = \lambda_i$. Symmetric matrices and matrices with unique eigenvalues are regular, as are some other matrices. The number of non-zero eigenvalues is the same as the rank of the matrix. Square matrices that have an inverse are also called nonsingular, and this is equivalent to having full rank. If the matrix is symmetric, the eigenvectors and eigenvalues are real and $\Gamma$ is orthogonal, so we have $A = \Gamma\Lambda\Gamma^\top$. The determinant of the matrix is the product of the eigenvalues (why?), which is zero if it is less than full rank. Note that if none of the eigenvalues are zero then $A^{-1} = \Gamma\Lambda^{-1}\Gamma^\top$. Let’s focus on symmetric matrices. The symmetric matrices that tend to arise in statistics are either positive definite (p.d.) or non-negative definite (n.n.d.). If a matrix is positive definite, then

by definition $x^\top Ax > 0$ for any $x$. Note that $x^\top Ax = \mathrm{Cov}(x^\top y) = \mathrm{Var}(x^\top y)$ if $\mathrm{Cov}(y) = A$, so positive definiteness amounts to having linear combinations of random variables having positive variance. So we must have that positive definite matrices are equivalent to variance-covariance matrices (I’ll just refer to this as a variance matrix or as a covariance matrix). If $A$ is p.d. then it has all positive eigenvalues and it must have an inverse, though as we’ll see, from a numerical perspective, we may not be able to compute it if some of the eigenvalues are very close to zero. In R, eigen(A)$vectors is $\Gamma$, with each column a vector, and eigen(A)$values contains the ordered eigenvalues. Let’s interpret the eigendecomposition in a generative context as a way of generating random vectors. We can generate $y$ s.t. $\mathrm{Cov}(y) = A$ if we generate $y = \Gamma\Lambda^{1/2}z$ where $\mathrm{Cov}(z) = I$ and $\Lambda^{1/2}$ is formed by taking the square roots of the eigenvalues. So $\sqrt{\lambda_i}$ is the standard deviation associated with the basis vector $\Gamma_{\cdot i}$. That is, the $z$’s provide the weights on the basis vectors, with scaling based on the eigenvalues. So $y$ is produced as a linear combination of eigenvectors as basis vectors, with the variance attributable to the basis vectors determined by the eigenvalues. If $x^\top Ax \geq 0$ then $A$ is nonnegative definite (also called positive semi-definite). In this case one or more eigenvalues can be zero. Let’s interpret this a bit more in the context of generating random vectors based on non-negative definite matrices, $y = \Gamma\Lambda^{1/2}z$ where $\mathrm{Cov}(z) = I$. Questions:

1. What does it mean when one or more eigenvalue (i.e., $\lambda_i = \Lambda_{ii}$) is zero?

2. Suppose I have an eigenvalue that is very small and I set it to zero. What will be the impact upon $y$ and $\mathrm{Cov}(y)$?

3. Now let’s consider the inverse of a covariance matrix, known as the precision matrix, $A^{-1} = \Gamma\Lambda^{-1}\Gamma^\top$. What does it mean if a $(\Lambda^{-1})_{ii}$ is very large? What if $(\Lambda^{-1})_{ii}$ is very small?

Consider an arbitrary $n \times p$ matrix, $X$. Any crossproduct or sum of squares matrix, such as $X^\top X$, is positive definite (non-negative definite if $p > n$). This makes sense as it’s just a scaling of an empirical covariance matrix.

1.11 Generalized inverses

Suppose I want to find x such that Ax = b. Mathematically the answer (provided A is invertible, i.e. of full rank) is x = A−1b. Generalized inverses arise in solving equations when A is not full rank. A generalized inverse is a matrix, A− s.t. AA−A = A. The Moore-Penrose inverse (the pseudo-inverse), A+, is a (unique) generalized inverse that also satisfies some additional properties. x = A+b is the solution to the linear system, Ax = b, that has the shortest length for x.

We can find the pseudo-inverse based on an eigendecomposition (or an SVD) as $\Gamma\Lambda^{+}\Gamma^\top$. We obtain $\Lambda^{+}$ from $\Lambda$ as follows. For values $\lambda_i > 0$, compute $1/\lambda_i$. All other values are set to 0. Let’s interpret this statistically. Suppose we have a precision matrix with one or more zero eigenvalues and we want to find the covariance matrix. A zero eigenvalue means we have no precision, or infinite variance, for some linear combination (i.e., for some basis vector). We take the pseudo-inverse and assign that linear combination zero variance. Let’s consider a specific example. Autoregressive models are often used for smoothing (in time, in space, and in covariates). A first order autoregressive model for $y_1, y_2, \ldots, y_T$ has $E(y_i|y_{-i}) = \frac{1}{2}(y_{i-1} + y_{i+1})$. A second order autoregressive model has $E(y_i|y_{-i}) = \frac{1}{6}(4y_{i-1} + 4y_{i+1} - y_{i-2} - y_{i+2})$. These constructions basically state that each value should be a smoothed version of its neighbors. One can figure out that the precision matrix for $y$ in the first order model is

$$\begin{pmatrix}
\ddots & \ddots & \ddots & & \\
-1 & 2 & -1 & 0 & \cdots \\
0 & -1 & 2 & -1 & \cdots \\
& & \ddots & \ddots & \ddots
\end{pmatrix}$$
and in the second order model is

$$\begin{pmatrix}
\ddots & \ddots & \ddots & \ddots & \ddots & \\
1 & -4 & 6 & -4 & 1 & \cdots \\
\cdots & 1 & -4 & 6 & -4 & 1 \\
& \ddots & \ddots & \ddots & \ddots & \ddots
\end{pmatrix}.$$

If we look at the eigendecomposition of such matrices, we see that in the first order case, the eigenvalue corresponding to the constant eigenvector is zero.

precMat <- matrix(c(1,-1,0,0,0,-1,2,-1,0,0,0,-1,2,-1,
                    0,0,0,-1,2,-1,0,0,0,-1,1), 5)
e <- eigen(precMat)
e$values

## [1] 3.618 2.618 1.382 0.382 0.000

e$vectors[ , 5]

## [1] 0.4472 0.4472 0.4472 0.4472 0.4472

# generate a realization
e$values[1:4] <- 1 / e$values[1:4]
y <- e$vec %*% (e$values * rnorm(5))
sum(y)

## [1] -5.246e-15

This means we have no information about the overall level of $y$. So how would we generate sample $y$ vectors? We can’t put infinite variance on the constant basis vector and still generate samples. Instead we use the pseudo-inverse and assign ZERO variance to the constant basis vector. This corresponds to generating realizations under the constraint that $\sum_i y_i$ has no variation, i.e., $\sum_i y_i = \bar{y} = 0$ - you can see this by seeing that $\mathrm{Var}(\Gamma_{\cdot i}^\top y) = 0$ when $\lambda_i = 0$. In the second order case, we have two non-identifiabilities: for the sum and for the linear component of the variation in $y$ (linear in the indices of $y$). I could parameterize a statistical model as $\mu + y$ where $y$ has covariance that is the generalized inverse discussed above. Then I allow for both a non-zero mean and for smooth variation governed by the autoregressive structure. In the second-order case, I would need to add a linear component as well, given the second non-identifiability.

1.12 Matrices arising in regression

In regression, we work with $X^\top X$. Some properties of this matrix are that it is symmetric and non-negative definite (hence our use of $(X^\top X)^{-1}$ in the OLS estimator). When is it not positive definite? Fitted values are $X\hat{\beta} = X(X^\top X)^{-1}X^\top Y = HY$. The “hat” matrix, $H$, projects $Y$ into the column space of $X$. $H$ is idempotent: $HH = H$, which makes sense - once you’ve projected into the space, any subsequent projection just gives you the same thing back. $H$ is singular. Why? Also, under what special circumstance would it not be singular?

2 Computational issues

2.1 Storing matrices

We’ve discussed column-major and row-major storage of matrices. First, retrieval of matrix elements from memory is quickest when multiple elements are contiguous in memory. So in a column-major language (e.g., R, Fortran), it is best to work with values in a common column (or entire columns), while in a row-major language (e.g., C) it is best to work with values in a common row.
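
As a small timing sketch of this (the matrix size is arbitrary), column extraction in R touches contiguous memory while row extraction does not:

n <- 3000
A <- matrix(rnorm(n^2), n)
system.time(for (j in 1:n) tmp <- A[, j])   # contiguous access
system.time(for (i in 1:n) tmp <- A[i, ])   # strided access, typically slower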

9 In some cases, one can save space (and potentially speed) by overwriting the output from a matrix calculation into the space occupied by an input. This occurs in some clever implementations of matrix factorizations.

2.2 Algorithms

A good choice of algorithm can change the efficiency of a calculation by one or more orders of magnitude, and many of the improvements in computational speed over recent decades have been in algorithms rather than in computer speed. Most matrix algebra calculations can be done in multiple ways. For example, we could compute $b = Ax$ in either of the following ways, denoted here in pseudocode.

1. Stack the inner products of the rows of A with x.

for(i = 1:n){
  b_i = 0
  for(j = 1:m){
    b_i = b_i + a_{ij} x_j
  }
}

2. Take the linear combination (based on x) of the columns of A

for(i = 1:n){
  b_i = 0
}
for(j = 1:m){
  for(i = 1:n){
    b_i = b_i + a_{ij} x_j
  }
}

In this case the two approaches involve the same number of operations but the first might be better for row-major matrices (so might be how we would implement in C) and the second for column-major (so might be how we would implement in Fortran). Challenge: check whether the second approach is faster in R. (Write the code just doing the outer loop and doing the inner loop using vectorized calculation.)
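
One possible way to set up that comparison (a sketch, not the official solution; the sizes are arbitrary), with the inner loop vectorized in each case:

n <- 2000; m <- 2000
A <- matrix(rnorm(n * m), n); x <- rnorm(m)
system.time({                      # approach 1: inner products of rows of A with x
  b1 <- numeric(n)
  for (i in 1:n) b1[i] <- sum(A[i, ] * x)
})
system.time({                      # approach 2: linear combination of the columns of A
  b2 <- numeric(n)
  for (j in 1:m) b2 <- b2 + A[, j] * x[j]
})
all.equal(b1, b2)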

General computational issues The same caveats we discussed in terms of computer arithmetic hold naturally for linear algebra, since this involves arithmetic with many elements. Good implementations of algorithms are aware of the danger of catastrophic cancellation and of the possibility of dividing by zero or by values that are near zero.

2.3 Ill-conditioned problems

Basics A problem is ill-conditioned if small changes to values in the computation result in large changes in the result. This is quantified by something called the condition number of a calculation. For different operations there are different condition numbers. Ill-conditionedness arises most often in terms of matrix inversion, so the standard condition number is the “condition number with respect to inversion”, which when using the L2 norm is the ratio of the absolute values of the largest to smallest eigenvalue. Here’s an example:

 107 8 7   756 5  A =   .    8 6 10 9       7 5 9 10 

The solution of $Ax = b$ for $b = (32, 23, 33, 31)$ is $x = (1, 1, 1, 1)$, while the solution for $b + \delta b = (32.1, 22.9, 33.1, 30.9)$ is $x + \delta x = (9.2, -12.6, 4.5, -1.1)$, where $\delta$ is notation for a perturbation to the vector or matrix. What’s going on?

norm2 <- function(x) sqrt(sum(x^2))
A <- matrix(c(10, 7, 8, 7, 7, 5, 6, 5, 8, 6, 10, 9, 7, 5, 9, 10), 4)
e <- eigen(A)
b <- c(32, 23, 33, 31)
bPerturb <- c(32.1, 22.9, 33.1, 30.9)
x <- solve(A, b)
xPerturb <- solve(A, bPerturb)
norm2(x - xPerturb)

## [1] 16.4

norm2(b - bPerturb)

## [1] 0.2

norm2(x - xPerturb)/norm2(x)

## [1] 8.198

(e$val[1]/e$val[4]) * norm2(b - bPerturb)/norm2(b)

## [1] 9.943

Some manipulations with inequalities involving the induced matrix norm (for any chosen vector norm, but we might as well just think about the Euclidean norm) (see Gentle-CS Sec. 5.1) give
$$\frac{\|\delta x\|}{\|x\|} \leq \|A\|\|A^{-1}\|\frac{\|\delta b\|}{\|b\|}$$
where we define the condition number w.r.t. inversion as $\mathrm{cond}(A) \equiv \|A\|\|A^{-1}\|$. We’ll generally work with the $L_2$ norm, and for a nonsingular square matrix the result is that the condition number is the ratio of the absolute values of the largest and smallest magnitude eigenvalues. This makes sense since $\|A\|_2$ is the absolute value of the largest magnitude eigenvalue of $A$ and $\|A^{-1}\|_2$ is the inverse of the absolute value of the smallest magnitude eigenvalue of $A$. We see in the code above that the large disparity in eigenvalues of $A$ leads to an effect predictable from our inequality above, with the condition number helping us find an upper bound. The main use of these ideas for our purposes is in thinking about the numerical accuracy of a linear system solution (Gentle-NLA Sec 3.4). On a computer we have the system

(A + δA)(x + δx)= b + δb

where the ’perturbation’ is from the inaccuracy of computer numbers. Our exploration of computer numbers tells us that
$$\frac{\|\delta b\|}{\|b\|} \approx 10^{-p}; \qquad \frac{\|\delta A\|}{\|A\|} \approx 10^{-p}$$
where $p = 16$ for standard double precision floating points. Following Gentle, one gets the approximation

$$\frac{\|\delta x\|}{\|x\|} \approx \mathrm{cond}(A)10^{-p},$$
so if $\mathrm{cond}(A) \approx 10^{t}$, we have accuracy of order $10^{t-p}$ instead of $10^{-p}$. (Gentle cautions that this holds only if $10^{t-p} \ll 1$.) So we can think of the condition number as giving us the number of digits of accuracy lost during a computation relative to the precision of numbers on the computer. E.g., a condition number of $10^{8}$ means we lose 8 digits of accuracy relative to our original 16 on standard systems. One issue is that estimating the condition number is itself subject to numerical error and requires computation of $A^{-1}$ (albeit not in the case of the $L_2$ norm with square, nonsingular $A$), but see Golub and van Loan (1996; p. 76-78) for an algorithm.
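
As a quick check (reusing the objects A and e from the example code above), R's kappa() estimate of the condition number agrees with the eigenvalue ratio for this symmetric positive definite matrix:

kappa(A, exact = TRUE)     # exact condition number via the singular values
e$values[1] / e$values[4]  # ratio of largest to smallest eigenvalue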

Improving conditioning Ill-conditioned problems in statistics often arise from collinearity of regressors. Often the best solution is not a numerical one, but re-thinking the modeling approach, as this generally indicates statistical issues beyond just the numerical difficulties. A general comment on improving conditioning is that we want to avoid large differences in the magnitudes of numbers involved in a calculation. In some contexts such as regression, we can center and scale the columns to avoid such differences - this will improve the condition of the problem. E.g., in simple quadratic regression with $x = \{1990, \ldots, 2010\}$ (e.g., regressing on calendar years), we see that centering and scaling the matrix columns makes a huge difference on the condition number.

x1 <- 1990:2010
x2 <- x1 - 2000  # centered
x3 <- x2/10      # centered and scaled
X1 <- cbind(rep(1, 21), x1, x1^2)
X2 <- cbind(rep(1, 21), x2, x2^2)
X3 <- cbind(rep(1, 21), x3, x3^2)
e1 <- eigen(crossprod(X1))
e1$values

## [1] 3.360e+14 7.699e+02 -3.833e-08

e2 <- eigen(crossprod(X2))
e2$values

## [1] 50677.704 770.000 9.296

e3 <- eigen(crossprod(X3))
e3$values

## [1] 24.113 7.700 1.954

The basic story is that simple strategies often solve the problem, and that you should be cognizant of the absolute and relative magnitudes involved in your calculations.

3 Matrix factorizations (decompositions) and solving systems of linear equations

Suppose we want to solve the following linear system:

$$Ax = b$$
$$x = A^{-1}b$$

Numerically, this is never done by finding the inverse and multiplying. Rather we solve the system using a matrix decomposition (or equivalent set of steps). One approach uses Gaussian elimination (equivalent to the LU decomposition), while another uses the Cholesky decomposition. There are also iterative methods that generate a sequence of approximations to the solution but reduce computation (provided they are stopped before the exact solution is found). Gentle-CS has a nice table overviewing the various factorizations (Table 5.1, page 219).

3.1 Triangular systems

As a preface, let’s figure out how to solve Ax = b if A is upper triangular. The basic algorithm

proceeds from the bottom up (and therefore is called a ’backsolve’). We solve for $x_n$ trivially, and then move upwards plugging in the known values of $x$ and solving for the remaining unknown in each row (each equation).

1. $x_n = b_n/A_{nn}$

2. Now, for $k < n$, $x_k = \frac{b_k - \sum_{j=k+1}^{n} x_j A_{kj}}{A_{kk}}$

How many multiplies and adds are done? Solving lower triangular systems is very similar and involves the same number of calculations. In R, backsolve() solves upper triangular systems and forwardsolve() solves lower triangular systems:

n <- 20
X <- crossprod(matrix(rnorm(n^2), n))
b <- rnorm(n)
U <- chol(crossprod(X))  # U is upper-triangular
L <- t(U)                # L is lower-triangular

out1 <- backsolve(U, b)
out2 <- forwardsolve(L, b)
all.equal(out1, c(solve(U) %*% b))

## [1] TRUE

all.equal(out2, c(solve(L) %*% b))

## [1] TRUE

We can also solve $(U^\top)^{-1}b$ and $(L^\top)^{-1}b$ as

backsolve(U, b, transpose = TRUE)
forwardsolve(L, b, transpose = TRUE)

To reiterate the distinction between matrix inversion and solving a system of equations, when we write $U^{-1}b$, what we mean on a computer is to carry out the above algorithm, not to find the inverse and then multiply.

3.2 Gaussian elimination (LU decomposition)

Gaussian elimination is a standard way of directly computing a solution for Ax = b. It is equivalent to the LU decomposition. LU is primarily done with square matrices, but not always. Also LU decompositions do exist for some singular matrices. The idea of Gaussian elimination is to convert the problem to a triangular system. In class, we’ll walk through Gaussian elimination in detail and see how it relates to the LU decomposition. I’ll describe it more briefly here. Following what we learned in algebra when we have multiple equations, we preserve the solution, x, when we add multiples of rows (i.e., add multiples of

equations) together. This amounts to doing $L_1Ax = L_1b$ for a lower-triangular matrix $L_1$ that produces all zeroes in the first column of $L_1A$ except for the first row. We proceed to zero out values below the diagonal for the other columns of $A$. The result is $L_{n-1}\cdots L_1 A \equiv U$ and $L_{n-1}\cdots L_1 b \equiv b^{*}$, where $U$ is upper triangular. This is the forward reduction step of Gaussian elimination. Then the backward elimination step solves $Ux = b^{*}$. If we’re just looking for the solution of the system, we don’t need the lower-triangular factor $L = (L_{n-1}\cdots L_1)^{-1}$ in $A = LU$, but it turns out to have a simple form that is computed as we go along: it is unit lower triangular and the values below the diagonal are the negatives of the values below the diagonals in $L_1, \ldots, L_{n-1}$ (note that each $L_j$ has non-zeroes below the diagonal only in

the $j$th column). As a side note related to storage, it turns out that as we proceed, we can store the elements of $L$ and $U$ in the original $A$ matrix, except for the implicit 1s on the diagonal of $L$. If we look at solve.default() in R, we see that it uses dgesv. A Google search indicates that this is a Lapack routine that does the LU decomposition with partial pivoting and row interchanges (see below on what these are), so R is using the algorithm we’ve just discussed. For a problem set problem, you’ll compute the computational order of the LU decomposition and compare it to explicitly finding the inverse and multiplying the inverse by one or more vectors. One additional complexity is that we want to avoid dividing by very small values to avoid introducing numerical inaccuracy (we would get large values that might overwhelm whatever they are being added to, and small errors in the divisor will have large effects on the result). This can be done on the fly by interchanging equations to use the equation (row) that produces the largest value to divide by. For example in the first step, we would switch the first equation (first row) for whichever of the remaining equations has the largest value in the first column. This is called partial pivoting. The divisors are called pivots. Complete pivoting also considers interchanging columns, and while theoretically better, partial pivoting is generally sufficient and requires fewer computations. Note that partial pivoting can be expressed as multiplying along the way by permutation matrices, $P_1, \ldots, P_{n-1}$, that switch rows. Based on pivoting, we have $PA = LU$, where $P = P_{n-1}\cdots P_1$. In the demo code, we’ll see a toy example of the impact of pivoting. Finally, $|PA| = |P||A| = |L||U| = |U|$ (why?), so $|A| = |U|/|P|$, and since the determinant of each permutation matrix $P_j$ is -1 (except when $P_j = I$ because we don’t need to switch rows), we just need to multiply by minus one if there is an odd number of permutations. Or if we know the matrix is non-negative definite, we just take the absolute value of $|U|$. So Gaussian elimination provides a fast stable way to find the determinant.

3.3 Cholesky decomposition

When A is p.d., we can use the Cholesky decomposition to solve a system of equations. Positive definite matrices can be decomposed as U ⊤U = A where U is upper triangular. U is called a square root matrix and is unique (apart from the sign, which we fix by requiring the diagonals to be positive). One algorithm for computing U is:

1. $U_{11} = \sqrt{A_{11}}$

2. For $j = 2, \ldots, n$, $U_{1j} = A_{1j}/U_{11}$

3. For i =2,...,n,

• $U_{ii} = \sqrt{A_{ii} - \sum_{k=1}^{i-1} U_{ki}^2}$

• for $j = i+1, \ldots, n$: $U_{ij} = (A_{ij} - \sum_{k=1}^{i-1} U_{ki}U_{kj})/U_{ii}$

We can then solve a system of equations as: $U^{-1}(U^{-\top}b)$, which in R is done as follows:

backsolve(U, backsolve(U, b, transpose = TRUE))
backsolve(U, forwardsolve(t(U), b))  # equivalent but less efficient
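
To connect the algorithm listed above to code, here's a bare-bones implementation (for illustration only, with no pivoting or error checking, and myChol() is a made-up name; in practice one would just call chol()):

myChol <- function(A) {
  n <- nrow(A)
  U <- matrix(0, n, n)
  for (i in 1:n) {
    # diagonal entry, then the rest of row i
    U[i, i] <- sqrt(A[i, i] - sum(U[seq_len(i - 1), i]^2))
    if (i < n)
      for (j in (i + 1):n)
        U[i, j] <- (A[i, j] - sum(U[seq_len(i - 1), i] * U[seq_len(i - 1), j])) / U[i, i]
  }
  U
}
Atest <- crossprod(matrix(rnorm(16), 4))  # a random positive definite test matrix
all.equal(myChol(Atest), chol(Atest))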

The Cholesky has some nice advantages over the LU: (1) while both are $O(n^3)$, the Cholesky involves only half as many computations, $n^3/6 + O(n^2)$, and (2) the Cholesky factorization has only $(n^2 + n)/2$ unique values compared to $n^2 + n$ for the LU. Of course the LU is more broadly applicable. The Cholesky does require computation of square roots, but it turns out this is not too intensive. There is also a method for finding the Cholesky without square roots.

Uses of the Cholesky The standard algorithm for generating $y \sim \mathcal{N}(0, A)$ is:

U <- chol(A)
y <- crossprod(U, rnorm(n))  # i.e., t(U) %*% rnorm(n), but much faster

Question: where will most of the time in this two-step calculation be spent? If a regression design matrix, $X$, is full rank, then $X^\top X$ is positive definite, so we could find $\hat{\beta} = (X^\top X)^{-1}X^\top Y$ using either the Cholesky or Gaussian elimination. Challenge: write efficient R code to carry out the OLS solution using either LU or Cholesky factorization. However, it turns out that the standard approach is to work with $X$ using the QR decomposition rather than working with $X^\top X$; working with $X$ is more numerically stable, though in most situations without extreme collinearity, either of the approaches will be fine.
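
One possible sketch for the Cholesky route (not claimed to be the intended solution; the data are simulated), using only triangular solves:

set.seed(1)
n <- 100; p <- 5
X <- cbind(1, matrix(rnorm(n * (p - 1)), n))
y <- rnorm(n)
U <- chol(crossprod(X))                               # X'X = U'U
z <- backsolve(U, crossprod(X, y), transpose = TRUE)  # solve U'z = X'y
beta <- backsolve(U, z)                               # solve U beta = z
all.equal(c(beta), unname(coef(lm(y ~ X - 1))))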

Numerical issues with eigendecompositions and Cholesky decompositions for positive definite matrices Monahan comments that in general Gaussian elimination and the Cholesky decomposition are very stable. However, if the matrix is very ill-conditioned we can get $A_{ii} - \sum_k U_{ki}^2$ being negative and then the algorithm stops when we try to take the square root. In this case, the Cholesky decomposition does not exist numerically although it exists mathematically. It’s not all that hard to produce such a matrix, particularly when working with high-dimensional covariance matrices with large correlations.

library(fields)
locs <- runif(100)
rho <- 0.1

C <- exp(-rdist(locs)^2/rho^2)
e <- eigen(C)
e$values[96:100]

## [1] -3.664e-16 -4.240e-16 -6.321e-16 -7.410e-16 -7.625e-16

U <- chol(C)

## Error: the leading minor of order 26 is not positive definite

vals <- abs(e$values)
max(vals)/min(vals)

## [1] 3.938e+18

U <- chol(C, pivot = TRUE)

## Warning: matrix not positive definite

We can think about the accuracy here as follows. Suppose we have a matrix whose diagonal

elements (i.e., the variances) are order of magnitude 1 and that the true value of a $U_{ii}$ is less than $1 \times 10^{-16}$. From the given $A_{ii}$ we are subtracting $\sum_k U_{ki}^2$ and trying to calculate this very small number, but we know that we can only represent the values $A_{ii}$ and $\sum_k U_{ki}^2$ accurately to 16 places, so the difference is garbage starting in the 17th position and could well be negative. Now realize that $\sum_k U_{ki}^2$ is the result of a potentially large set of arithmetic operations, and is likely represented accurately to fewer than 16 places. Now if the true value of $U_{ii}$ is smaller than the accuracy to which $\sum_k U_{ki}^2$ is represented, we can get a difference that is negative. Note that when the Cholesky fails, we can still compute an eigendecomposition, but we have negative numeric eigenvalues. Even if all the eigenvalues are numerically positive (or equivalently, we’re able to get the Cholesky), errors in small eigenvalues near machine precision could have large effects when we work with the inverse of the matrix. This is what happens when we have columns of the $X$ matrix nearly collinear. We cannot statistically distinguish the effect of two (or more) covariates, and this plays out numerically in terms of unstable results. A strategy when working with mathematically but not numerically positive definite $A$ is to set eigenvalues or singular values to zero when they get very small, which amounts to using a pseudo-inverse and setting to zero any linear combinations with very small variance. We can also use pivoting with the Cholesky and accumulate zeroes in the last $n - q$ rows (for cases where we try to take the square root of a negative number), corresponding to the columns of $A$ that are numerically linearly dependent.

3.4 QR decomposition

3.4.1 Introduction

The QR decomposition is available for any matrix, $X = QR$, with $Q$ orthogonal and $R$ upper triangular. If $X$ is non-square, $n \times p$ with $n > p$, then the leading $p$ rows of $R$ provide an upper triangular matrix ($R_1$) and the remaining rows are 0. (I’m using $p$ because the QR is generally applied to design matrices in regression). In this case we really only need the first $p$ columns of $Q$, and we have $X = Q_1R_1$, the ’skinny’ QR (this is what R’s QR provides). For uniqueness, we can require the diagonals of $R$ to be nonnegative, and then $R$ will be the same as the upper-triangular Cholesky factor of $X^\top X$:

X⊤X = R⊤Q⊤QR = R⊤R

There are three standard approaches for computing the QR, using (1) reflections (Householder transformations), (2) rotations (Givens transformations), or (3) Gram-Schmidt orthogonalization (see below for details). For an $n \times n$ $X$, the QR (for the Householder approach) requires $2n^3/3$ flops, so QR is less efficient than LU or Cholesky. We can also obtain the pseudo-inverse of $X$ from the QR: $X^{+} = [R_1^{-1}\ 0]Q^\top$. In the case that $X$ is not full-rank, there is a version of the QR that will work (involving pivoting) and we end up with some additional zeroes on the diagonal of $R_1$.

3.4.2 Regression and the QR

Often QR is used to fit linear models, including in R. Consider the linear model in the form $Y = X\beta + \epsilon$, finding $\hat{\beta} = (X^\top X)^{-1}X^\top Y$. Let’s consider the skinny QR and note that $R^\top$ is invertible. Therefore, we can express the normal equations as

$$
\begin{aligned}
X^\top X\beta &= X^\top Y \\
R^\top Q^\top QR\beta &= R^\top Q^\top Y \\
R\beta &= Q^\top Y
\end{aligned}
$$

and solving for β is just a backsolve since R is upper-triangular. Furthermore the standard regres- sion quantities, such as the hat matrix, the SSE, the residuals, etc. can be easily expressed in terms

of $Q$ and $R$. Why use the QR instead of the Cholesky on $X^\top X$? The condition number of $X$ is the square root of that of $X^\top X$, and the QR factorizes $X$. Monahan has a discussion of the condition of the regression problem, but from a larger perspective, the situations where numerical accuracy is a concern are generally cases where the OLS estimators are not particularly helpful anyway (e.g., highly collinear predictors). What about the computational order of the different approaches to least squares? The Cholesky is $np^2 + \frac{1}{3}p^3$, an algorithm called sweeping is $np^2 + p^3$, the Householder method for QR is $2np^2 - \frac{2}{3}p^3$, and the modified Gram-Schmidt approach for QR is $2np^2$. So if $n \gg p$ then Cholesky (and sweeping) are faster than the QR approaches. According to Monahan, modified Gram-Schmidt is most numerically stable and sweeping least. In general, regression is pretty quick unless $p$ is large since it is linear in $n$, so it may not be worth worrying too much about computational differences of the sort noted here.

3.4.3 Regression and the QR in R

Regression in R uses the QR decomposition via qr(), which calls a Fortran function. qr() (and the Fortran functions that are called) is specifically designed to output quantities useful in fitting linear models. Note that by default you get the skinny QR, namely only the first $p$ rows of $R$ and the first $p$ columns of $Q$, where the latter form an orthonormal basis for the column space of $X$. The remaining columns form an orthonormal basis for the null space of $X^\top$ (the orthogonal complement of the column space of $X$). The analogy in regression is that we get the basis vectors for the regression, while adding the remaining columns gives us the full $n$-dimensional space of the observations. qr() returns the result as a list meant for use by other tools. R stores the $R$ matrix in the upper triangle of $qr, while the lower triangle of $qr and $qraux store the information for constructing $Q$ (this relates to the Householder-related vectors $u$ below). One can multiply by $Q$ using qr.qy() and by $Q^\top$ using qr.qty(). If you want to extract $R$ and $Q$, the following will work:

X.qr = qr(X)
Q = qr.Q(X.qr)
R = qr.R(X.qr)

As a side note, there are QR-based functions that provide regression-related quantities, such as qr.resid(), qr.fitted() and qr.coef(). These functions (and their Fortran counterparts) exist because one can work through the various regression quantities of interest and find their expressions in terms of Q and R, with nice properties resulting from Q being orthogonal and R triangular.
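
As a small sketch (with simulated data) of using these pieces directly to get the least squares solution, solving $R\beta = Q^\top Y$ via a backsolve:

set.seed(1)
n <- 50
X <- cbind(1, rnorm(n))
y <- rnorm(n)
qrX <- qr(X)
beta <- backsolve(qr.R(qrX), qr.qty(qrX, y)[1:ncol(X)])  # qr.qty() gives Q'y without forming Q
all.equal(c(beta), unname(coef(lm(y ~ X - 1))))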

3.4.4 Computing the QR decomposition

We’ll work through some of the details of the different approaches to the QR, in part because they involve some concepts that may be useful in other contexts. One approach involves reflections of vectors and a second rotations of vectors. Reflections and rotations are transformations that are performed by orthogonal matrices. The determinant of a reflection matrix is -1 and the determinant of a rotation matrix is 1. We’ll see some of the details in the demo code.

Reflections If $u$ and $v$ are orthonormal vectors and $x$ is in the space spanned by $u$ and $v$, $x = c_1u + c_2v$, then $\tilde{x} = -c_1u + c_2v$ is a reflection (a Householder reflection) along the $u$ dimension (since we are using the negative of that basis vector). We can think of this as reflecting across the plane perpendicular to $u$. This extends simply to higher dimensions with orthonormal vectors, $u, v_1, v_2, \ldots$ Suppose we want to formulate the reflection in terms of a “Householder” matrix, $Q$. It turns out that $Qx = \tilde{x}$ if $Q = I - 2uu^\top$. $Q$ has the following properties: (1) $Qu = -u$, (2) $Qv = v$ for $u^\top v = 0$, (3) $Q$ is orthogonal and symmetric. One way to create the QR decomposition is by a series of Householder transformations that create an upper triangular $R$ from $X$:

$$
\begin{aligned}
R &= Q_p \cdots Q_1 X \\
Q &= (Q_p \cdots Q_1)^\top
\end{aligned}
$$
where we make use of the symmetry in defining $Q$.

Basically $Q_1$ reflects the first column of $X$ with respect to a carefully chosen $u$, so that the result is all zeroes except for the first element. We want $Q_1x = \tilde{x} = (\|x\|, 0, \ldots, 0)$. This can be achieved with $u = \frac{x - \tilde{x}}{\|x - \tilde{x}\|}$. Then $Q_2$ makes the last $n - 2$ rows of the second column equal to zero. We’ll work through this a bit in class. In the regression context, as we work through the individual transformations, $Q_j = I - 2u_ju_j^\top$, we apply them to $X$ and $Y$ to create $R$ (note this would not involve doing the full matrix multiplication - think about what calculations are actually needed) and $Q^\top Y$, and then solve $R\beta = Q^\top Y$. To find $\mathrm{Cov}(\hat{\beta}) = (X^\top X)^{-1} = (R^\top R)^{-1} = R^{-1}R^{-\top}$ we do need to invert $R$, but it’s upper-triangular and of dimension $p \times p$. It turns out that $Q^\top Y$ can be partitioned into the first $p$ and the last $n - p$ elements, $z^{(1)}$ and $z^{(2)}$. The SSR is $\|z^{(1)}\|^2$ and SSE is $\|z^{(2)}\|^2$.
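
A quick numeric check of the Householder construction just described (the vector is random):

x <- rnorm(5)
x.tilde <- c(sqrt(sum(x^2)), rep(0, 4))
u <- (x - x.tilde) / sqrt(sum((x - x.tilde)^2))
Q <- diag(5) - 2 * u %*% t(u)   # Householder matrix built from u
round(Q %*% x, 10)              # should be (||x||, 0, 0, 0, 0)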

Rotations A Givens rotation matrix rotates a vector in a two-dimensional subspace to be axis oriented with respect to one of the two dimensions by changing the value of the other dimension.

E.g., we can create $\tilde{x} = (x_1, \ldots, \tilde{x}_p, \ldots, 0, \ldots, x_n)$ from $x = (x_1, \ldots, x_p, \ldots, x_q, \ldots, x_n)$ using a matrix multiplication: $\tilde{x} = Qx$. $Q$ is orthogonal but not symmetric. We can use a series of Givens rotations to do the QR, but unless it is done carefully, more computations are needed than with Householder reflections. The basic story is that we apply a series of Givens rotations to $X$ such that we zero out the lower triangular elements.

$$
\begin{aligned}
R &= Q_{pn} \cdots Q_{22}Q_{1n} \cdots Q_{13}Q_{12} X \\
Q &= (Q_{pn} \cdots Q_{12})^\top
\end{aligned}
$$

Note that we create the $n - p$ zero rows in $R$ (because the calculations affect the upper triangle of $R$), but we can then ignore those rows and the corresponding columns of $Q$.

Gram-Schmidt Orthogonalization Gram-Schmidt involves finding a set of orthonormal vectors

to span the same space as a set of LIN vectors, x1,...,xp. If we take the LIN vectors to be the columns of X, so that we are discussing the column space of X, then G-S yields the QR decomposition. Here’s the algorithm:

1. $\tilde{x}_1 = \frac{x_1}{\|x_1\|}$ (normalize the first vector)

2. Orthogonalize the remaining vectors with respect to x˜1:

(a) $\tilde{x}_2 = \frac{x_2 - \tilde{x}_1^\top x_2\tilde{x}_1}{\|x_2 - \tilde{x}_1^\top x_2\tilde{x}_1\|}$, which orthogonalizes with respect to $\tilde{x}_1$ and normalizes. Note that $\tilde{x}_1^\top x_2\tilde{x}_1 = \langle\tilde{x}_1, x_2\rangle\tilde{x}_1$. So we are finding a scaling, $c\tilde{x}_1$, where $c$ is based on the inner product, to remove the variation in the $x_1$ direction from $x_2$.

(b) For $k > 2$, find interim vectors, $x_k^{(2)}$, by orthogonalizing with respect to $\tilde{x}_1$

3. Proceed for $k = 3, \ldots$, in turn orthogonalizing and normalizing the first of the remaining vectors w.r.t. $\tilde{x}_{k-1}$ and orthogonalizing the remaining vectors w.r.t. $\tilde{x}_{k-1}$ to get new interim vectors

Mathematically, we could instead orthogonalize $x_2$ w.r.t. $\tilde{x}_1$, then orthogonalize $x_3$ w.r.t. $\{\tilde{x}_1, \tilde{x}_2\}$, etc. The algorithm above is the modified G-S, and is known to be more numerically stable if the columns of $X$ are close to collinear, giving vectors that are closer to orthogonal. The resulting $\tilde{x}$ vectors are the columns of $Q$. The elements of $R$ are obtained as we proceed: the diagonal values are the normalization values in the denominators, while the off-diagonals are the inner products with the already-computed columns of $Q$ that are computed as part of the numerators.

Another way to think about this is that $R = Q^\top X$, which is the same as regressing the columns of $X$ on $Q$, since $(Q^\top Q)^{-1}Q^\top X = Q^\top X$. By construction, the first column of $X$ is a scaling of the first column of $Q$, the second column of $X$ is a linear combination of the first two columns of $Q$, etc., so $R$ being upper triangular makes sense.
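
Here's a compact sketch of the modified Gram-Schmidt algorithm described above (no pivoting or rank checking, and mgs() is a made-up name), checked by reassembling $X$:

mgs <- function(X) {
  n <- nrow(X); p <- ncol(X)
  Q <- X; R <- matrix(0, p, p)
  for (k in 1:p) {
    # normalize column k, then immediately orthogonalize the remaining columns against it
    R[k, k] <- sqrt(sum(Q[, k]^2))
    Q[, k] <- Q[, k] / R[k, k]
    if (k < p) for (j in (k + 1):p) {
      R[k, j] <- sum(Q[, k] * Q[, j])
      Q[, j] <- Q[, j] - R[k, j] * Q[, k]
    }
  }
  list(Q = Q, R = R)
}
X <- matrix(rnorm(30), 10, 3)
out <- mgs(X)
all.equal(out$Q %*% out$R, X)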

3.5 Determinants

The absolute value of the determinant of a square matrix can be found from the product of the diagonals of the triangular matrix in any factorization that gives a triangular (including diagonal) matrix times an orthogonal matrix (or matrices), since the determinant of an orthogonal matrix is either one or minus one.
$$|A| = |QR| = |Q||R| = \pm|R|$$
$$|A^\top A| = |(QR)^\top QR| = |R^\top R| = |R_1^\top R_1| = |R_1|^2$$
In R, the following will do it (on the log scale), since $R$ is stored in the upper triangle of the $qr element.

myqr = qr(A)
magn = sum(log(abs(diag(myqr$qr))))
sign = prod(sign(diag(myqr$qr)))

An alternative is the product of the diagonal elements of $D$ (the singular values) in the SVD factorization, $A = UDV^\top$. For non-negative definite matrices, we know the determinant is non-negative, so the uncertainty about the sign is not an issue. For positive definite matrices, a good approach is to use the product of the diagonal elements of the Cholesky decomposition. One can also use the product of the eigenvalues:
$$|A| = |\Gamma\Lambda\Gamma^{-1}| = |\Gamma||\Gamma^{-1}||\Lambda| = |\Lambda|$$

Computation Computing from any of these diagonal or triangular matrices as the product of the diagonals is prone to overflow and underflow, so we always work on the log scale as the sum of the log of the values. When some of these may be negative, we can always keep track of the number of negative values and take the log of the absolute values. Often we will have the factorization as a result of other parts of the computation, so we get the determinant for free. R’s determinant() uses the LU decomposition. Supposedly det() just wraps determinant, but I can’t seem to pass the logarithm argument into det(), so determinant() seems more useful.
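
For comparison with the QR-based snippet above (and assuming A is the same matrix used there), determinant() returns the log modulus and the sign directly, via its LU factorization:

d <- determinant(A, logarithm = TRUE)   # works on the log scale to avoid over/underflow
d$modulus   # log of the absolute value of the determinant
d$sign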

4 Eigendecomposition and SVD

4.1 Eigendecomposition

The eigendecomposition (spectral decomposition) is useful in considering convergence of algorithms and of course for statistical decompositions such as PCA. We think of decomposing the components of variation into orthogonal patterns (the eigenvectors) with variances (eigenvalues) associated with each pattern. Square symmetric matrices have real eigenvectors and eigenvalues, with the factorization into orthogonal $\Gamma$ and diagonal $\Lambda$, $A = \Gamma\Lambda\Gamma^\top$, where the eigenvalues on the diagonal of $\Lambda$ are ordered in decreasing value. Of course this is equivalent to the definition of an eigenvalue/eigenvector pair as a pair such that $Ax = \lambda x$ where $x$ is the eigenvector and $\lambda$ is a scalar, the eigenvalue. The inverse of the eigendecomposition is simply $\Gamma\Lambda^{-1}\Gamma^\top$. On a similar note, we can create a square root matrix, $\Gamma\Lambda^{1/2}$, by taking the square roots of the eigenvalues. The spectral radius of $A$, denoted $\rho(A)$, is the maximum of the absolute values of the eigenvalues. As we saw when talking about ill-conditionedness, for symmetric matrices, this maximum is the induced norm, so we have $\rho(A) = \|A\|_2$. It turns out that $\rho(A) \leq \|A\|$ for any induced matrix norm. The spectral radius comes up in determining the rate of convergence of some iterative algorithms.

Computation There are several methods for computing eigenvalues; a common one for doing the full eigendecomposition is the QR algorithm. The first step is to reduce A to upper Hessenberg form, which is an upper triangular matrix except that the first subdiagonal in the lower triangular part can be non-zero. For symmetric matrices, the result is actually tridiagonal. We can do the reduction using Householder reflections or Givens rotations. At this point the QR decomposition (using Givens rotations) is applied iteratively (to a version of the matrix in which the diagonals are shifted), and the result converges to a diagonal matrix, which provides the eigenvalues. It's more work to get the eigenvectors, but they are obtained as a product of the Householder matrices (required for the initial reduction) multiplied by the product of the Q matrices from the successive QR decompositions. We won't go into the algorithm in detail, but note that it involves manipulations and ideas we've seen already. If only the largest (or the first few largest) eigenvalues and their eigenvectors are needed, which can come up in time series and Markov chain contexts, the problem is easier and can be solved by the power method. E.g., in a Markov chain context, steady state is reached through x_t = A^t x_0. One can find the largest eigenvector by multiplying by A many times, normalizing at each step: v^{(k)} = A z^{(k−1)} and z^{(k)} = v^{(k)}/‖v^{(k)}‖. There is an extension to find the p largest eigenvalues and their vectors. See the demo code for an implementation.
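
The demo code is the authoritative version; the following is just a rough sketch of the power method for a symmetric test matrix, with the Rayleigh quotient used to estimate the largest eigenvalue:

set.seed(1)
A <- crossprod(matrix(rnorm(25), 5, 5))
z <- rnorm(5)
z <- z / sqrt(sum(z^2))
for (k in 1:200) {
    v <- A %*% z             # v^(k) = A z^(k-1)
    z <- v / sqrt(sum(v^2))  # normalize at each step
}
lambda1 <- drop(crossprod(z, A %*% z))  # Rayleigh quotient estimate of the largest eigenvalue
c(lambda1, eigen(A)$values[1])          # should agree closely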

4.2 Singular value decomposition

Let's consider an n × m matrix, A, with n ≥ m (if m > n, we can always work with A⊤). This often is a matrix representing m features of n observations. We could have n documents and m words, or n gene expression levels and m experimental conditions, etc. A can always be decomposed as A = UDV⊤ where U and V are matrices with orthonormal columns (the left and right singular vectors, which are eigenvectors of AA⊤ and A⊤A, respectively) and D is diagonal with non-negative values (which are the eigenvalues in the case of square, positive semi-definite A, and whose squares are the eigenvalues of A⊤A). The SVD can be represented in more than one way. One representation is

A_{n×m} = U_{n×k} D_{k×k} V⊤_{k×m} = Σ_{j=1}^k D_{jj} u_j v_j⊤
where u_j and v_j are the columns of U and V and where k is the rank of A (which is at most the minimum of n and m, of course). The diagonal elements of D are the singular values. If A is positive semi-definite, the eigendecomposition is an SVD. Furthermore, A⊤A = V D²V⊤ and AA⊤ = U D²U⊤, so we can find the eigendecomposition of such matrices using the SVD of A. Note that the squares of the singular values of A are the eigenvalues of A⊤A and AA⊤. We can also fill out the matrices to get

A = U_{n×n} D_{n×m} V⊤_{m×m}

where the added rows and columns of D are zero, with the upper left block being the D_{k×k} from above.

Uses The SVD is an excellent way to determine a matrix rank and to construct a pseudo-inverse (A⁺ = V D⁺U⊤). We can use the SVD to approximate A by taking A ≈ Ã = Σ_{j=1}^p D_{jj} u_j v_j⊤ for p < k.
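
Here's a small sketch (my own example) of these uses via svd(): a rank-p approximation and a pseudo-inverse built from the numerically non-zero singular values:

set.seed(1)
A <- matrix(rnorm(50 * 10), 50, 10)
s <- svd(A)
p <- 3
Ap <- s$u[, 1:p] %*% diag(s$d[1:p]) %*% t(s$v[, 1:p])      # rank-p approximation
pos <- s$d > 1e-10 * s$d[1]                                # numerically non-zero singular values
Aplus <- s$v[, pos] %*% diag(1/s$d[pos]) %*% t(s$u[, pos]) # pseudo-inverse V D^+ U'
range(Aplus %*% A - diag(ncol(A)))                         # near the identity when A has full column rank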

Computation The basic algorithm is similar to the QR method for the eigendecomposition. We use a series of Householder transformations on the left and right to reduce A to a bidiagonal matrix, A^{(0)}. The post-multiplications generate the zeros in the upper triangle. (A bidiagonal matrix is one with non-zeroes only on the diagonal and the first diagonal above the main diagonal.) Then the algorithm produces a series of upper bidiagonal matrices, A^{(0)}, A^{(1)}, etc., that converge to a diagonal matrix, D. Each step is carried out by a sequence of Givens transformations:

A^{(j+1)} = R_{m−2}⊤ R_{m−3}⊤ ··· R_0⊤ A^{(j)} T_0 T_1 ··· T_{m−2} = R A^{(j)} T

Note that the multiplication on the right is required to produce zeroes in the kth column and the kth row. This eventually gives A^{(...)} = D and, by construction, U (the product of the pre-multiplied Householder matrices and the R matrices) and V (the product of the post-multiplied Householder matrices and the T matrices) are orthogonal. The result is then transformed by a diagonal matrix to make the elements of D non-negative and by permutation matrices to order the elements of D in nonincreasing order.

5 Computation

5.1 Linear algebra in R

Speedups and storage savings can be obtained by working with matrices stored in special formats when the matrices have special structure. E.g., we might store a symmetric matrix as a full matrix but only use the upper or lower triangle. Banded matrices and block diagonal matrices are other common formats. Banded matrices are all zero except for A_{i,i+c_k} for some small number of integers, c_k. Viewed as an image, these have bands. The bands are known as co-diagonals. Note that for many matrix decompositions, you can change whether all of the aspects of the decomposition are returned, or just some, which may speed calculations.

Some useful packages in R for matrices are Matrix, spam, and bdsmatrix. Matrix can represent a variety of rectangular matrices, including triangular, orthogonal, diagonal, etc., and provides methods for various matrix calculations that are specific to the matrix type. spam handles general sparse matrices with fast matrix calculations, in particular a fast Cholesky decomposition. bdsmatrix focuses on block-diagonal matrices, which arise frequently in contexts where there is clustering that induces within-cluster correlation and cross-cluster independence.

In general, matrix operations in R go to compiled C or Fortran code without much intermediate R code, so they can actually be pretty efficient and are based on the best algorithms developed by numerical experts. The core libraries that are used are LAPACK and BLAS (the Linear Algebra PACKage and the Basic Linear Algebra Subroutines). As we've discussed in the parallelization unit, one way to speed up code that relies heavily on linear algebra is to make sure you have a BLAS library tuned to your machine. These include GotoBLAS (free), Intel's MKL, and AMD's ACML. These can be installed and R can be linked to the shared object library file (.so file) for the fast BLAS. These BLAS libraries are also available in threaded versions that farm out the calculations across multiple cores or processors that share memory. BLAS routines do vector operations (level 1), matrix-vector operations (level 2), and dense matrix-matrix operations (level 3). Often the name of the routine has as its first letter "d", "s", or "c" to indicate the routine is double precision, single precision, or complex. LAPACK builds on BLAS to implement standard linear algebra routines such as eigendecomposition, solutions of linear systems, a variety of factorizations, etc.

5.2 Sparse matrices

As an example of exploiting sparsity, here's how the spam package in R stores a sparse matrix. Consider the matrix to be row-major and store the non-zero elements in order in a vector called value. Then create a vector called rowptr that stores the position of the first element of each row. Finally, have a vector, colindices, that tells the column identity of each element. We can do a fast matrix multiply, Ab, as follows in pseudo-code:
for(i in 1:nrows(A)) {
    x[i] = 0
    for(j in rowptr[i]:(rowptr[i+1] - 1)) {
        x[i] = x[i] + value[j] * b[colindices[j]]
    }
}

How many computations have we done? Only k multiplies and O(k) additions, where k is the number of non-zero elements of A. Compare this to the usual O(n²) for dense multiplication. Note that for the Cholesky of a sparse matrix, if the sparsity pattern is fixed but the entries change, one can precompute an optimal re-ordering that retains as much sparsity in U as possible. Then multiple Cholesky decompositions can be done more quickly as the entries change.
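
To make the pseudo-code above concrete, here is a runnable toy version (my own small example, not the spam internals) of the row-major sparse multiply, checked against dense multiplication:

A <- rbind(c(0, 2, 0, 0),
           c(1, 0, 0, 3),
           c(0, 0, 4, 0))
b <- c(1, 2, 3, 4)
value <- c(2, 1, 3, 4)       # non-zero elements, stored row by row
colindices <- c(2, 1, 4, 3)  # column of each non-zero element
rowptr <- c(1, 2, 4, 5)      # where each row starts in 'value', plus an end marker
x <- numeric(nrow(A))        # assumes every row has at least one non-zero
for (i in 1:nrow(A)) {
    for (j in rowptr[i]:(rowptr[i + 1] - 1)) {
        x[i] <- x[i] + value[j] * b[colindices[j]]
    }
}
cbind(x, A %*% b)  # the two columns should match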

Banded matrices Suppose we have a banded matrix A where the lower bandwidth is p, namely

A_{ij} = 0 for i > j + p, and the upper bandwidth is q (A_{ij} = 0 for j > i + q). An alternative to reducing to Ux = b* is to compute A = LU and then do two triangular solves, U⁻¹(L⁻¹b). One can show that the computational complexity of the LU factorization is O(npq) for banded matrices, while solving the two triangular systems is O(np + nq), so for small p and q the speedup can be dramatic.

Banded matrices come up in time series analysis. E.g., MA models produce banded covariance structures because the covariance is zero after a certain number of lags.

5.3 Low rank updates

A transformation of the form A − uv⊤ is a rank-one update because uv⊤ is of rank one. More generally, a low rank update of A is Ã = A − UV⊤ where U and V are n × m with n ≥ m. The Sherman-Morrison-Woodbury formula tells us that

Ã⁻¹ = A⁻¹ + A⁻¹U(I_m − V⊤A⁻¹U)⁻¹V⊤A⁻¹
so if we know x₀ = A⁻¹b, then the solution to Ãx = b is x₀ + A⁻¹U(I_m − V⊤A⁻¹U)⁻¹V⊤x₀. Provided m is not too large, and particularly if we already have a factorization of A, then A⁻¹U is not too bad computationally, and I_m − V⊤A⁻¹U is m × m. As a result, A⁻¹(U(···)⁻¹V⊤x₀) isn't too bad. This also comes up in working with precision matrices in Bayesian problems where we may have A⁻¹ but not A (we often add precision matrices to find conditional normal distributions). An alternative expression for the formula is Ã = A + UCV⊤, and the identity tells us

Ã⁻¹ = A⁻¹ − A⁻¹U(C⁻¹ + V⊤A⁻¹U)⁻¹V⊤A⁻¹
Basically, Sherman-Morrison-Woodbury gives us matrix identities that we can use in combination with our knowledge of smart ways of solving systems of equations.
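
A quick numerical check of the identity in the Ã = A − UV⊤ form (a sketch with made-up matrices, not production code):

set.seed(1)
n <- 6; m <- 2
A <- crossprod(matrix(rnorm(n * n), n, n)) + diag(n)
U <- matrix(rnorm(n * m), n, m)
V <- matrix(rnorm(n * m), n, m)
Atilde <- A - U %*% t(V)
Ainv <- solve(A)
AtildeInv <- Ainv + Ainv %*% U %*% solve(diag(m) - t(V) %*% Ainv %*% U) %*% t(V) %*% Ainv
range(AtildeInv - solve(Atilde))  # should be numerically zero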

6 Iterative solutions of linear systems

Gauss-Seidel Suppose we want to iteratively solve Ax = b. Here’s the algorithm, which sequen- tially updates each element of x in turn.

• Start with an initial approximation, x^{(0)}.

• Hold all but x₁ constant and solve to find x₁^{(1)} = (1/a₁₁)(b₁ − Σ_{j=2}^n a_{1j} x_j^{(0)}).
• Repeat for the other rows of A (i.e., the other elements of x), finding x^{(1)}.

• Now iterate to get x^{(2)}, x^{(3)}, etc. until a convergence criterion is achieved, such as ‖x^{(k)} − x^{(k−1)}‖ ≤ ε or ‖r^{(k)} − r^{(k−1)}‖ ≤ ε for r^{(k)} = b − Ax^{(k)}.

Let's consider how many operations are involved in a single update: O(n) for each element, so O(n²) for each update. Thus if we can stop well before n iterations, we've saved computation relative to exact methods. If we decompose A = L + D + U, where L is strictly lower triangular, D is diagonal, and U is strictly upper triangular, then Gauss-Seidel is equivalent to solving

(L + D)x^{(k+1)} = b − Ux^{(k)}

and we know that solving the lower triangular system is O(n²). It turns out that the rate of convergence depends on the spectral radius of (L + D)⁻¹U. Gauss-Seidel amounts to optimizing by moving in axis-oriented directions, so it can be slow in some cases.
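
Here's a bare-bones Gauss-Seidel sketch (my own, with a made-up test matrix); it assumes the iteration converges, e.g., for a diagonally dominant A, and does not check this:

gaussSeidel <- function(A, b, x = rep(0, length(b)), tol = 1e-8, maxit = 1000) {
    n <- length(b)
    for (k in 1:maxit) {
        xold <- x
        for (i in 1:n) {
            # uses the most recently updated elements of x
            x[i] <- (b[i] - sum(A[i, -i] * x[-i])) / A[i, i]
        }
        if (sqrt(sum((x - xold)^2)) <= tol) break
    }
    x
}
set.seed(1)
A <- crossprod(matrix(rnorm(25), 5, 5)) + 5 * diag(5)  # roughly diagonally dominant
b <- rnorm(5)
range(gaussSeidel(A, b) - solve(A, b))  # should be very small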

Conjugate gradient For positive definite A, conjugate gradient (CG) reexpresses the solution to Ax = b as an optimization problem, minimizing

f(x) = (1/2) x⊤Ax − x⊤b,
since the derivative of f(x) is Ax − b, and at the minimum this gives Ax − b = 0. Instead of finding the minimum by following the gradient at each step (so-called steepest descent, which can give slow convergence - we'll see a demonstration of this in the optimization unit), CG chooses directions that are mutually conjugate w.r.t. A: d_i⊤ A d_j = 0 for i ≠ j. The method successively chooses vectors giving the direction, d_k, in which to move down towards the minimum and a scaling of how much to move, α_k. If we start at x^{(0)}, the kth point we move to is x^{(k)} = x^{(k−1)} + α_k d_k, so we have

x^{(k)} = x^{(0)} + Σ_{j≤k} α_j d_j

and we use a convergence criterion such as given above for Gauss-Seidel. The directions are chosen to be the residuals, b − Ax^{(k)}. Here's the basic algorithm:
• Choose x^{(0)} and define the residual, r^{(0)} = b − Ax^{(0)} (the error on the scale of b) and the direction, d₀ = r^{(0)}, and set k = 0.

• Then iterate

– α_k = (r^{(k)⊤} r^{(k)}) / (d_k⊤ A d_k) (choose the step size so the next error will be orthogonal to the current direction - which we can express in terms of the residual, which is easily computable)

– x^{(k+1)} = x^{(k)} + α_k d_k (update current value)

– r^{(k+1)} = r^{(k)} − α_k A d_k (update current residual)
– d_{k+1} = r^{(k+1)} + ((r^{(k+1)⊤} r^{(k+1)}) / (r^{(k)⊤} r^{(k)})) d_k (choose the next direction by conjugate Gram-Schmidt, starting with r^{(k+1)} and removing components that are not A-orthogonal to previous directions; it turns out that r^{(k+1)} is already A-orthogonal to all but d_k).

• Stop when ‖r^{(k+1)}‖ is sufficiently small.
The convergence of the algorithm depends in a complicated way on the eigenvalues, but in general convergence is faster when the condition number is smaller (the eigenvalues are not too spread out). CG will in principle give the exact answer in n steps (where A is n × n). However, computationally we lose accuracy, and interest in the algorithm is really as an iterative approximation where we stop before n steps. The approach basically amounts to moving in axis-oriented directions in a space stretched by A. In general, CG is used for large sparse systems. See the extensive description from Shewchuk for more details and for the figures shown in class, as well as the use of CG when A is not positive definite.
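
The following is a minimal CG sketch (mine, not Shewchuk's code) that follows the updates above for a small positive definite test matrix:

cg <- function(A, b, x = rep(0, length(b)), tol = 1e-10, maxit = length(b)) {
    r <- b - A %*% x
    d <- r
    for (k in 1:maxit) {
        Ad <- A %*% d
        alpha <- drop(crossprod(r) / crossprod(d, Ad))  # step size
        x <- x + alpha * d
        rnew <- r - alpha * Ad
        if (sqrt(sum(rnew^2)) < tol) break
        beta <- drop(crossprod(rnew) / crossprod(r))
        d <- rnew + beta * d   # next conjugate direction
        r <- rnew
    }
    drop(x)
}
set.seed(1)
A <- crossprod(matrix(rnorm(25), 5, 5)) + diag(5)
b <- rnorm(5)
range(cg(A, b) - solve(A, b))  # should be essentially zero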

Updating a solution Sometimes we have solved a system, Ax = b, and then need to solve Ax = c. If we have solved the initial system using a factorization, we can reuse that factorization and solve the new system in O(n²). Iterative approaches can do a nice job if c = b + δb. Start with the solution x for Ax = b as x^{(0)} and use one of the methods above.

Simulation

November 1, 2012

References:

• Gentle: Computational Statistics

• Monahan: Numerical Methods of Statistics

Many (most?) statistical papers include a simulation (i.e., Monte Carlo) study. The basic idea is that closed form analysis of the properties of a statistical method/model is often hard to do. Even if possible, it usually involves approximations or simplifications. A canonical situation is that we have an asymptotic result and we want to know what happens in finite samples, but often we do not even have the asymptotic result. Instead, we can estimate mathematical expressions using random numbers. So we design a simulation study to evaluate the method/model or compare multiple methods. The result is that the statistician carries out an experiment, generally varying different factors to see what has an effect on the outcome of interest. The basic strategy generally involves simulating data and then using the method(s) on the simulated data, summarizing the results to assess/compare the method(s). Most simulation studies aim to approximate an integral, generally an expected value (mean, bias, variance, MSE, probability, etc.). In low dimensions, methods such as Gaussian quadrature are best for estimating an integral but these methods don’t scale well [we’ll discuss this in the next unit on integration/differentiation], so in higher dimensions we often use Monte Carlo techniques.

1 Monte Carlo considerations

1.1 Monte Carlo basics

The basic idea is that we often want to estimate μ ≡ E_f(h(X)) for X ∼ f. Note that if h is an indicator function, this includes estimation of probabilities, e.g., p = P(X ≤ x) = F(x) = ∫_{−∞}^x f(t)dt = ∫ I(t ≤ x)f(t)dt = E_f(I(X ≤ x)). We would estimate variances or MSEs by having h involve squared terms. We get an MC estimate of μ based on an iid sample of a large number of values of X from f:

μ̂ = (1/m) Σ_{i=1}^m h(X_i),
which is justified by the Law of Large Numbers. The simulation variance of μ̂ is Var(μ̂) = σ²/m, with σ² = Var(h(X)). An estimator of σ² = E_f((h(X) − μ)²) is σ̂² = (1/(m−1)) Σ_{i=1}^m (h(X_i) − μ̂)². So our MC simulation error is based on

V̂ar(μ̂) = (1/(m(m−1))) Σ_{i=1}^m (h(X_i) − μ̂)².

Sometimes the X_i are generated in a dependent fashion (e.g., sequential MC or MCMC), in which case this variance estimator does not hold because the samples are not IID, but the estimator μ̂ is still correct. [As a sidenote, a common misperception with MCMC is that you should thin your chains because of dependence of the samples. This is not correct - the only reason to thin a chain is if you want to save on computer storage or processing.] Note that in this case the randomness in the system is very well-defined (as it is in survey sampling, but unlike in most other applications of statistics), because it comes from the random number generation that we perform as part of our attempt to estimate μ. Happily, we are in control of m, so in principle we can reduce the simulation error to as little as we desire. Unhappily, as usual, the standard error goes down with the square root of m.
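
As a tiny illustration (my own example), here is the MC estimator and its estimated simulation standard error for μ = E(X²) with X ∼ N(0, 1), whose true value is 1:

set.seed(1)
m <- 10000
x <- rnorm(m)
h <- x^2
muhat <- mean(h)
seMC <- sqrt(var(h) / m)  # estimated simulation standard error
c(muhat, seMC)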

1.2 Simple example

Suppose we've come up with a fabulous new estimator for the mean of a distribution. The estimator is to take the middle value of the sorted observations as our estimate of the mean of the entire distribution. We work out some theory to show that this estimator is robust to outlying observations and we come up with a snazzy new name for our estimator. We call it the 'median'. Let's denote it as X̃. Unfortunately, we have no good way of estimating Var(X̃) = E((X̃ − E(X̃))²) analytically. We decide to use a Monte Carlo estimate

(1/m) Σ_{i=1}^m (X̃_i − X̃̄)²

where X̃̄ = (1/m) Σ_i X̃_i. Each X̃_i in this case is generated by generating a dataset and calculating the median. In evaluating the variance of the median and comparing it to our standard estimator, the sample mean, what decisions do we have to make in our Monte Carlo procedure?

1.3 Variance reduction

There are some tools for variance reduction in MC settings. One is importance sampling (see Section 3). Others are the use of control variates and antithetic sampling. I haven't personally run across these latter in practice, so I'm not sure how widely used they are and won't go into them here. In some cases we can set up natural strata, for which we know the probability of being in each stratum. Then we would estimate μ for each stratum and combine the estimates based on the probabilities. The intuition is that we remove the variability in sampling amongst the strata from our simulation. Another strategy that comes up in MCMC contexts is Rao-Blackwellization. Suppose we want to know E(h(X)) where X = {X₁, X₂}. Iterated expectation tells us that E(h(X)) = E(E(h(X)|X₂)). If we can compute E(h(X)|X₂) = ∫ h(x₁, x₂) f(x₁|x₂) dx₁, then we should avoid introducing stochasticity related to the X₁ draw (since we can analytically integrate over that) and

only average over stochasticity from the X₂ draw by estimating E_{X₂}(E(h(X)|X₂)). The estimator is
μ̂_RB = (1/m) Σ_{i=1}^m E(h(X)|X_{2,i})

where we either draw from the marginal distribution of X₂, or equivalently, draw X but only use X₂. Our MC estimator averages over the simulated values of X₂. This is called Rao-Blackwellization because it relates to the idea of conditioning on a sufficient statistic. It has lower variance because the variance of each term in the sum of the Rao-Blackwellized estimator is Var(E(h(X)|X₂)), which is less than the variance in the usual MC estimator, Var(h(X)), based on the usual iterated variance formula: Var(X) = E(Var(X|Y)) + Var(E(X|Y)) ≥ Var(E(X|Y)).
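
Here's a toy illustration of Rao-Blackwellization (my own made-up model, not from the notes): X₂ ∼ Gamma(3, 1), X₁ | X₂ ∼ N(0, 1/X₂), and h(X) = X₁², so E(h(X) | X₂) = 1/X₂ and the target is E(1/X₂) = 0.5.

set.seed(1)
m <- 100000
x2 <- rgamma(m, shape = 3, rate = 1)
x1 <- rnorm(m, mean = 0, sd = 1/sqrt(x2))
muhatPlain <- mean(x1^2)     # usual MC estimator
muhatRB <- mean(1/x2)        # Rao-Blackwellized estimator
c(muhatPlain, muhatRB)
c(var(x1^2), var(1/x2)) / m  # estimated simulation variances; the RB one is smaller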

2 Random number generation (RNG)

At the core of simulations is the ability to generate random numbers, and based on that, random variables. On a computer, our goal is to generate sequences of pseudo-random numbers that behave like random numbers but are replicable. The reason that replicability is important is so that we can reproduce the simulation.

2.1 Generating random uniforms on a computer

Generating a sequence of random standard uniforms is the basis for all generation of random variables, since random uniforms (either a single one or more than one) can be used to generate values from other distributions. Most random numbers on a computer are pseudo-random. The numbers are chosen from a deterministic stream of numbers that behave like random numbers but are actually a finite sequence (recall that both integers and real numbers on a computer are actually discrete and there are finitely many distinct values), so it’s actually possible to get repeats. The seed of a RNG is the place within that sequence where you start to use the pseudo-random numbers. Many RNG methods are sequential congruential methods. The basic idea is that the next value is

u_k = f(u_{k−1}, ..., u_{k−j}) mod m

for some function f and some positive integer m. Often j = 1; mod just means to take the remainder after dividing by m. One then generates the random standard uniform value as u_k/m, which by construction is in [0, 1]. Given the construction, such sequences are periodic if the subsequence ever reappears, which is of course guaranteed because there is a finite number of possible values given that all the values are remainders of divisions by a fixed number. One key to a good random number generator (RNG) is to have a very long period. An example of a sequential congruential method is a basic linear congruential generator:

u_k = (a u_{k−1}) mod m

with integer a, m, and u_i values. Here the periodicity can't exceed m − 1 (the method is set up so that we never get u_k = 0, as this causes the algorithm to break), so we only have m − 1 possible values. The seed is the initial state, u₀ - i.e., the point in the sequence at which we start. By setting the seed you guarantee reproducibility since, given a starting value, the sequence is deterministic. In general a and m are chosen to be large, but of course they can't be too large if they are to be represented as computer integers. The standard values of m are Mersenne primes, which have the form 2^p − 1 (though these are not prime for all p), with m = 2^31 − 1 common. Here's an example of a linear congruential sampler:

n <- 100
a <- 171
m <- 30269
u <- rep(NA, n)
u[1] <- 7306
for (i in 2:n) u[i] <- (a * u[i - 1])%%m
u <- u/m
uFromR <- runif(n)
par(mfrow = c(2, 2))
plot(1:n, u, type = "l")
plot(1:n, uFromR, type = "l")
hist(u, nclass = 25)
hist(uFromR, nclass = 25)

[Figure: trace plots and histograms of u (the hand-coded generator) and uFromR (runif).]

A wide variety of different RNGs have been proposed. Many have turned out to have substantial defects based on tests designed to assess whether the behavior of the RNG mimics true randomness. Some of the behavior we want to ensure is uniformity of each individual random deviate, independence of sequences of deviates, and multivariate uniformity of subsequences. One test of an RNG that many RNGs don't perform well on is to assess the properties of k-tuples - subsequences of length k, which should be independently distributed in the k-dimensional unit hypercube. Unfortunately, linear congruential methods produce values that lie on a simple lattice in k-space, i.e., the points are not selected from q^k uniformly spaced points, where q is the number of unique values. Instead, points often lie on parallel lines in the hypercube.

Combining generators can yield better generators. The Wichmann-Hill is an option in R and is a combination of three linear congruential generators with a = {171, 172, 170}, m = {30269, 30307, 30323}, and u_i = (x_i/30269 + y_i/30307 + z_i/30323) mod 1, where x, y, and z are generated from the three individual generators. Let's mimic the Wichmann-Hill manually:

RNGkind("Wichmann-Hill")
set.seed(0)
saveSeed <- .Random.seed
uFromR <- runif(10)
a <- c(171, 172, 170)
m <- c(30269, 30307, 30323)
xyz <- matrix(NA, nr = 10, nc = 3)
xyz[1, ] <- (a * saveSeed[2:4])%%m
for (i in 2:10) xyz[i, ] <- (a * xyz[i - 1, ])%%m
for (i in 1:10) print(c(uFromR[i], sum(xyz[i, ]/m)%%1))

## [1] 0.4626 0.4626
## [1] 0.2658 0.2658
## [1] 0.5772 0.5772
## [1] 0.5108 0.5108
## [1] 0.3376 0.3376
## [1] 0.3576 0.3576
## [1] 0.413 0.413
## [1] 0.1329 0.1329
## [1] 0.255 0.255
## [1] 0.9202 0.9202

# we should be able to recover the current value of the seed
xyz[10, ]

## [1] 20696 2593 4576

.Random.seed[2:4]

## [1] 20696 2593 4576

By default R uses something called the Mersenne twister, which is in the class of generalized feedback shift registers (GFSR). The basic idea of a GFSR is to come up with a deterministic generator of bits (i.e., a way to generate sequences of 0s and 1s), B_i, i = 1, 2, 3, .... The pseudo-random numbers are then determined as sequential subsequences of length L from {B_i}, considered as a base-2 number and divided by 2^L to get a number in (0, 1). In general, the sequence of bits is generated by taking B_i to be the exclusive-or [i.e., 0 + 0 = 0; 0 + 1 = 1; 1 + 0 = 1; 1 + 1 = 0] summation of two previous bits further back in the sequence, where the lengths of the lags are carefully chosen.

Additional notes Generators should give you the same sequence of random numbers, starting at a given seed, whether you ask for a bunch of numbers at once or sequentially ask for individual numbers. When one invokes a RNG without a seed, the software generally has a method for choosing a seed, often based on the system clock. There have been some attempts to generate truly random numbers based on physical randomness. One that is based on quantum physics is http://www.idquantique.com/true-random-number-generator/quantis-usb-pcie-pci.html. Another approach is based on lava lamps!

2.2 RNG in R

We can change the RNG in R using RNGkind(). We can set the seed with set.seed(). The seed is stored in .Random.seed. The first element indicates the type of RNG (and the type of normal RV generator). The remaining values are specific to the RNG. In the demo code, we've seen that for Wichmann-Hill, the remaining three numbers are the current values of {x, y, z}. In R the default RNG is the Mersenne twister (?RNGkind), which is considered to be state-of-the-art – it has some theoretical support, has performed reasonably on standard tests of pseudorandom numbers and has been used without evidence of serious failure. Plus it's fast (because bitwise operations are fast). In fact this points out one of the nice features of R, which is that for something as important as this, the default is generally carefully chosen by R's developers. The particular Mersenne twister used has a periodicity of 2^19937 − 1 ≈ 10^6000. Practically speaking, this means that if we generated one random uniform per nanosecond for 10 billion years, then we would generate 10^25 numbers, well short of the period. So we don't need to worry about the periodicity! The seed for the Mersenne twister is a set of 624 32-bit integers plus a position in the set, where the position is .Random.seed[2].

We can set the seed by passing an integer to set.seed(), which then sets as many actual seeds as required for a given generator. Here I'll refer to the integer passed to set.seed() as the seed. Ideally, nearby seeds generally should not correspond to getting sequences from the stream that are closer to each other than far away seeds. According to Gentle (CS, p. 327) the input to set.seed() should be an integer, i ∈ {0, ..., 1023}, and each of these 1024 values produces positions in the RNG sequence that are "far away" from each other. I don't see any mention of this in the R documentation for set.seed() and furthermore, you can pass integers larger than 1023 to set.seed(), so I'm not sure how much to trust Gentle's claim. More on generating parallel streams of random numbers below.

So we get replicability by setting the seed to a specific value at the beginning of our simulation. We can then set the seed to that same value when we want to replicate the simulation.

set.seed(0)
rnorm(10)

## [1] -0.09400 0.19476 -0.41913 -0.21971 -0.65887 -0.55566 0.08172 ## [8] 0.20599 0.97703 -0.07111

set.seed(0)
rnorm(10)

## [1] -0.09400 0.19476 -0.41913 -0.21971 -0.65887 -0.55566 0.08172 ## [8] 0.20599 0.97703 -0.07111

We can also save the state of the RNG and pick up where we left off. So this code will pick up where you had left off, ignoring what happened in between saving to savedSeed and resetting.

set.seed(0)
rnorm(5)

## [1] -0.0940 0.1948 -0.4191 -0.2197 -0.6589

savedSeed <- .Random.seed
tmp <- sample(1:50, 2000, replace = TRUE)
.Random.seed <- savedSeed
rnorm(5)

## [1] -0.55566 0.08172 0.20599 0.97703 -0.07111

In some cases you might want to reset the seed upon exit from a function so that a user’s random number stream is unaffected:

f = function(args) {
    oldseed <- .Random.seed
    # other code
    .Random.seed <<- oldseed
}

Note the need to reassign to the global variable .Random.seed. RNGversion() allows you to revert to the RNG from previous versions of R, which is very helpful for reproducibility.

The RNGs in R generally return 32-bit (4-byte) integers converted to doubles, so there are at most 2^32 distinct values. This means you could get duplicated values in long runs, but this does not violate the comment about the periodicity because the two values after the two duplicated numbers will not be duplicates of each other – note there is a distinction between the values as presented to the user and the values as generated by the RNG algorithm.

One way to proceed if you're using both R and C is to have C use the R RNG, using the unif_rand (corresponding to runif()) and norm_rand (corresponding to rnorm()) functions in the Rmath library. This way you have a consistent source of random numbers and don't need to worry about issues with RNG in C. If you call C from R, this approach should also work; you could also generate all the random numbers you need in R and pass them to the C function.

Note that whenever a random number is generated, the software needs to retain information about what has been generated, so this is an example where a function must have a side effect not observed by the user. R frowns upon this sort of thing, but it's necessary in this case.

2.3 Random slippage

If the exact floating point representations of a random number sequence differ, even in the 14th, 15th, or 16th decimal place, and you run a simulation long enough, such a difference can be enough to change the result of some conditional calculation. Suppose your code involves:
> if(x > 0) { } else { }
As soon as a small difference changes the result of testing x > 0, the remainder of the simulation can change entirely. This happened to me in my thesis as a result of the difference between an AMD and an Intel processor, and it took a while to figure out.

2.4 RNG in parallel

We can generally rely on the RNG in R to give a reasonable set of values. One time when we want to think harder is when doing work with RNG in parallel on multiple processors. The worst thing that could happen is that one sets things up in such a way that every process is using the same sequence of random numbers. This could happen if you mistakenly set the same seed in each process, e.g., using set.seed(mySeed) in R on every process.

The naive approach The naive approach is to use a different seed for each process. E.g., if your processes are numbered id = 1, ..., p, with id unique to a process, use set.seed(id) on each process. This is likely not to cause problems, but raises the danger that two (or more) sequences might overlap. For an algorithm with dependence on the full sequence, such as an MCMC, this probably won't cause big problems (though you likely wouldn't know if it did), but for something like simple simulation studies, some of your 'independent' samples could be exact replicates of a sample on another process.

The clunky but effective approach One approach to avoid the problem is to do all your RNG on one process and distribute the random deviates to the other processes, but this can be infeasible with many random numbers.

The sophisticated approach More generally, to avoid this problem the key is to use an algorithm that ensures sequences that do not overlap. In R, there are two packages that deal with this, rlecuyer and rsprng. We'll go over rlecuyer, as I've heard that rsprng is deprecated (though there is no evidence of this on CRAN) and rsprng is (at the moment) not available for the Mac. The L'Ecuyer algorithm has a period of 2^191, which it divides into subsequences of length 2^127. Here's how you initialize independent sequences on different processes when using the parallel package's parallel apply functionality (illustrated here with parSapply()).

require(parallel)

## Loading required package: parallel

require(rlecuyer)

## Loading required package: rlecuyer

nSims <- 250
testFun <- function(i) {
    val <- runif(1)
    return(val)
}

nCores <- 4
RNGkind()

## [1] "Wichmann-Hill" "Inversion" cl <- makeCluster(nCores) iseed <- 0 clusterSetRNGStream(cl = cl, iseed = iseed) RNGkind()

## [1] "Wichmann-Hill" "Inversion"

# hmmm... clusterSetRNGStream() should set as 'L'Ecuyer-CMRG'
res <- parSapply(cl, 1:nSims, testFun)
clusterSetRNGStream(cl = cl, iseed = iseed)
res2 <- parSapply(cl, 1:nSims, testFun)
identical(res, res2)

## [1] TRUE

stopCluster(cl)

If you want to explicitly move from stream to stream, you can use nextRNGStream(). For example:

RNGkind("L'Ecuyer-CMRG") seed <- 0 set.seed(seed) ## now start M workers s <- .Random.seed for (i in 1:M) { s <- nextRNGStream(s) # now send s to worker i as .Random.seed }

When using mclapply(), you can use the mc.set.seed argument as follows (note that mc.set.seed is TRUE by default, but one needs to invoke RNGkind("L'Ecuyer-CMRG"), as in the code below).

require(parallel)
require(rlecuyer)
RNGkind("L'Ecuyer-CMRG")
res <- mclapply(seq_len(nSims), testFun, mc.cores = nSlots, mc.set.seed = TRUE)
# this also seems to automatically reset the seed when it is run
res2 <- mclapply(seq_len(nSims), testFun, mc.cores = nSlots, mc.set.seed = TRUE)
identical(res, res2)

foreach One question is whether foreach deals with RNG correctly. I don't see any documentation on this, but the developers (Revolution Analytics) are well aware of RNG issues, so it seems plausible that foreach handles things for you. Challenge: can we think of a way to test whether foreach is dealing with RNG correctly? What are the difficulties in doing so? There is also a package called doRNG that ensures that foreach loops are reproducible, but in my basic test, it appears that foreach loops are reproducible without doRNG, so I'm not sure what the deal is here.

3 Generating random variables

There are a variety of methods for generating from common distributions (normal, gamma, beta, Poisson, t, etc.). Since these tend to be built into R and presumably use good algorithms, we won't go into them. A variety of statistical computing and Monte Carlo books describe the various methods. Many are built on the relationships between different distributions - e.g., a beta random variable (RV) can be generated from two gamma RVs. Also note that you can call the C functions that implement the R distribution functions as a library (Rmath), so if you're coding in C or another language, you should be able to make use of the standard functions: {r,p,q,d}{norm,t,gamma,binom,pois,etc.} (as well as a variety of other R math functions, which can be seen in Rmath.h). Phil Spector has a writeup on this (http://www.stat.berkeley.edu/classes/s243/rmath.html) and material can also be found in the Writing R Extensions manual on CRAN (section 6.16).

3.1 Multivariate distributions

The mvtnorm package supplies code for working with the density and CDF of multivariate normal and t distributions.

To generate a multivariate normal, we've seen the standard method based on the Cholesky decomposition:

U <- chol(covMat)
crossprod(U, rnorm(nrow(covMat)))

For a singular covariance matrix we can use the Cholesky with pivoting, setting as many rows to zero as the rank deficiency. Then when we generate the multivariate normals, they respect the constraints implicit in the rank deficiency.

3.2 Inverse CDF

Most of you know the inverse CDF method. To generate X ∼ F where F is a CDF and is an invertible function, first generate Z ∼ U(0, 1), then x = F⁻¹(z). For discrete CDFs, one can work with a discretized version. For multivariate distributions, one can work with a univariate marginal and then a sequence of univariate conditionals: f(x₁) f(x₂|x₁) ··· f(x_k|x_{k−1}, ..., x₁), when the distribution allows this analytic decomposition.
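
As a minimal example (mine), the exponential distribution has F(x) = 1 − exp(−λx), so F⁻¹(z) = −log(1 − z)/λ:

set.seed(1)
n <- 100000
lambda <- 2
z <- runif(n)
x <- -log(1 - z) / lambda  # inverse CDF of the exponential
c(mean(x), 1/lambda)       # sample mean vs. true mean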

3.3 Rejection sampling

The basic idea of rejection sampling (RS) relies on the introduction of an auxiliary variable, u. Suppose X ∼ F. Then we can write f(x) = ∫₀^{f(x)} du, so f is the marginal density of X in the joint density (X, U) ∼ U{(x, u) : 0 < u < f(x)}. Now suppose we can sample from a majorizing density g with f(x) ≤ c g(x) for some constant c. Then the algorithm is:
1. generate x ∼ g
2. generate u ∼ U(0, 1)
3. if u ≤ f(x)/(c g(x)) then use x; otherwise go back to step 1
The intuition here is graphical: we generate from under a curve that is always above f(x) and accept only when u puts us under f(x) relative to the majorizing density. A key here is that the majorizing density have fatter tails than the density of interest, so that the constant c can exist. So we could use a t to generate from a normal but not the reverse. We'd like c to be small to reduce

the number of rejections because it turns out that 1/c = ∫f(x)dx / ∫c g(x)dx is the acceptance probability. This approach works in principle for multivariate densities, but as the dimension increases, the proportion of rejections grows, because more of the volume under c g(x) is above f(x). If f is costly to evaluate, we can sometimes reduce calculation using a lower bound on f. In

this case we accept if u ≤ f_low(y)/(c g(y)). If it is not, then we need to evaluate the ratio in the usual rejection sampling algorithm. This is called squeezing. One example of RS is to sample from a truncated normal. Of course we can just sample from the normal and then reject, but this can be inefficient, particularly if the truncation is far in the tail (a case in which the inverse CDF suffers from numerical difficulties). Suppose the truncation point is greater than zero. Working with the standardized version of the normal, you can use a translated exponential with lower end point equal to the truncation point as the majorizing density (Robert 1995; Statistics and Computing, and see calculations in the demo code). For truncation less than zero, just make the values negative.
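
Here's a simple rejection sampler sketch (my own example, not the truncated-normal case): standard normal draws using a Cauchy majorizer, with the constant c computed numerically.

set.seed(1)
# f/g is symmetric in x, so it suffices to maximize over (0, 5)
c0 <- optimize(function(x) dnorm(x) / dcauchy(x), c(0, 5), maximum = TRUE)$objective
rnormRS <- function(n) {
    out <- numeric(0)
    while (length(out) < n) {
        x <- rcauchy(n)   # proposal draws from g
        u <- runif(n)
        out <- c(out, x[u <= dnorm(x) / (c0 * dcauchy(x))])
    }
    out[1:n]
}
draws <- rnormRS(10000)
c(mean(draws), sd(draws))  # should be near 0 and 1
1 / c0                     # acceptance probability, about 0.66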

3.4 Adaptive rejection sampling

The difficulty of RS is finding a good enveloping function. Adaptive rejection sampling refines the envelope as the draws occur, in the case of a continuous, differentiable, log-concave density. The basic idea considers the log of the density and involves using tangents or secants to define an upper envelope and secants to define a lower envelope for a set of points in the support of the distribution. The result is that we have piecewise exponentials (since we are exponentiating from straight lines on the log scale) as the bounds. We can sample from the upper envelope based on sampling from a discrete distribution and then the appropriate exponential. The lower envelope is used for squeezing. We add points to the set that defines the envelopes whenever we accept a point that requires us to evaluate f(x) (the points that are accepted based on squeezing are not added to the set). We’ll talk this through some in class.

3.5 Importance sampling

Importance sampling (IS) allows us to estimate expected values, with some commonalities with rejection sampling.

E_f(h(X)) = ∫ h(x) (f(x)/g(x)) g(x) dx

so μ̂ = (1/m) Σ_i h(x_i) f(x_i)/g(x_i) for x_i drawn from g(x), where w_i* = f(x_i)/g(x_i) act as weights. Often in Bayesian contexts, we know f(x) only up to a normalizing constant. In this case we need to use w_i = w_i*/Σ_i w_i*.

Here we don't require the majorizing property, just that the densities have common support, but things can be badly behaved if we sample from a density with lighter tails than the density of interest. So in general we want g to have heavier tails. More specifically, for a low variance estimator of μ, we would want f(x_i)/g(x_i) to be large only when h(x_i) is very small, to avoid having overly influential points. This suggests we can reduce variance in an IS context by oversampling x for which h(x) is large and undersampling when it is small, since Var(μ̂) = (1/m) Var(h(X) f(X)/g(X)). An example is that if h is an indicator function that is 1 only for rare events, we should oversample rare events and then the IS estimator corrects for the oversampling. What if we actually want a sample from f, as opposed to estimating the expected value above?

We can draw x from the unweighted sample, {x_i}, with weights {w_i}. This is called sampling importance resampling (SIR).
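
As a small IS sketch (my own example), we estimate the tail probability P(X > 3) for X ∼ N(0, 1), oversampling the rare event by drawing from g = N(4, 1):

set.seed(1)
m <- 10000
y <- rnorm(m, mean = 4)
w <- dnorm(y) / dnorm(y, mean = 4)  # unnormalized weights f/g
muhatIS <- mean((y > 3) * w)
muhatNaive <- mean(rnorm(m) > 3)    # plain MC for comparison
c(muhatIS, muhatNaive, pnorm(3, lower.tail = FALSE))
sqrt(var((y > 3) * w) / m)          # simulation standard error of the IS estimator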

3.6 Ratio of uniforms

If U and V are uniform in C = {(u, v) : 0 ≤ u ≤ √f(v/u)}, then X = V/U has density proportional to f. The basic algorithm is to choose a rectangle that encloses C and sample until we find u ≤ √f(v/u). Then we use x = v/u as our RV. The larger region enclosing C is the majorizing region, and a simple approach (if f(x) and x²f(x) are bounded) is to choose the rectangle
0 ≤ u ≤ sup_x √f(x),   inf_x x√f(x) ≤ v ≤ sup_x x√f(x).
One can also consider truncating the rectangular region, depending on the features of f. Monahan recommends the ratio of uniforms, particularly a version for discrete distributions (p. 323 of the 2nd edition).
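
A quick sketch (my own example) for the standard normal, using the unnormalized density f(x) = exp(−x²/2), for which the bounding rectangle is 0 ≤ u ≤ 1 and |v| ≤ √(2/e):

set.seed(1)
m <- 100000
u <- runif(m, 0, 1)
v <- runif(m, -sqrt(2/exp(1)), sqrt(2/exp(1)))
accept <- u <= exp(-(v/u)^2 / 4)  # u <= sqrt(f(v/u))
x <- (v/u)[accept]
c(mean(x), sd(x), mean(accept))   # roughly 0, 1, and an acceptance rate near 0.73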

4 Design of simulation studies

Let’s pose a concrete example. This is based on a JASA paper that you’ll look at more carefully in a problem set. Suppose one is modeling data as being from a mixture of normal distributions:

f(y; θ) = Σ_{h=1}^m w_h f(y; μ_h, σ_h)
where f(y; μ_h, σ_h) is a normal density with mean μ_h and s.d. σ_h. A statistician has developed methodology for carrying out a hypothesis test for H₀: m = m₀ vs. H_a: m > m₀. First, what are the key issues that need to be assessed to evaluate their methodology? What do we want to know to assess a hypothesis test?

Second, what do we need to consider in carrying out a simulation study to address those issues?

4.1 Basic steps of a simulation study

1. Specify what makes up an individual experiment: sample size, distributions, parameters, statistic of interest, etc.

2. Determine what inputs, if any, to vary; e.g., sample sizes, parameters, data generating mech- anisms

3. Write code to carry out the individual experiment and return the quantity of interest

4. For each combination of inputs, repeat the experiment m times. Note this is an embarrassingly parallel calculation (in both the data-generating dimension and the input dimension(s)).

5. Summarize the results for each combination of interest, quantifying simulation uncertainty

6. Report the results in graphical or tabular form

4.2 Overview

Since a simulation study is an experiment, we should use the same principles of design and analysis we would recommend when advising a practitioner on setting up an experiment. These include efficiency, reporting of uncertainty, reproducibility and documentation.

In generating the data for a simulation study, we want to think about what structure real data would have that we want to mimic in the simulation study: distributional assumptions, parameter values, dependence structure, outliers, random effects, sample size (n), etc. All of these may become input variables in a simulation study. Often we compare two or more statistical methods conditioning on the data context and then assess whether the differences between methods vary with the data context choices. E.g., if we compare an MLE to a robust estimator, which is better under a given set of choices about the data generating mechanism, and how sensitive is the comparison to changing the features of the data generating mechanism? So the "treatment variable" is the choice of statistical method. We're then interested in sensitivity to the conditions.

Often we can have a large number of replicates (m) because the simulation is fast on a computer, so we can sometimes reduce the simulation error to essentially zero and thereby avoid reporting uncertainty. To do this, we need to calculate the simulation standard error, generally s/√m, and see how it compares to the effect sizes. This is particularly important when reporting on the bias of a statistical method.

We might denote the data, which could be the statistical estimator under each of two methods, as Y_{ijklq}, where i indexes the treatment, j, k, l index different additional input variables, and q ∈ {1, ..., m} indexes the replicate. E.g., j might index whether the data are from a t or normal, k the value of a parameter, and l the dataset sample size (i.e., different levels of n). One can think about choosing m based on a basic power calculation, though since we can always generate more replicates, one might just proceed sequentially and stop when the precision of the results is sufficient.

When comparing methods, it's best to use the same simulated datasets for each level of the treatment variable and to do an analysis that controls for the dataset (i.e., for the random numbers used), thereby removing some variability from the error term. A simple example is to do a paired analysis, where we look at differences between the outcome for two statistical methods, pairing based on the simulated dataset. One can even use the "same" random number generation for the replicates under different conditions. E.g., in assessing sensitivity to a t vs. normal data generating mechanism, we might generate the normal RVs and then for the t use the same random numbers, in the sense of using the same quantiles of the t as were generated for the normal - this is pretty easy, as seen below. This helps to control for random differences between the datasets.

devs <- rnorm(100)
tdevs <- qt(pnorm(devs), df = 1)
plot(devs, tdevs)
abline(0, 1)

[Figure: scatterplot of tdevs versus devs with the line y = x.]

4.3 Experimental Design

A typical context is that one wants to know the effect of multiple input variables on some outcome. Often scientists, and even statisticians doing simulation studies, will vary one input variable at a time. As we know from standard experimental design, this is inefficient.

The standard strategy is to discretize the inputs, each into a small number of levels. If we have a small enough number of inputs and of levels, we can do a full factorial design (potentially with replication). For example, if we have three inputs and three levels each, we have 3³ = 27 different treatment combinations. Choosing the levels in a reasonable way is obviously important.

As the number of inputs and/or levels increases to the point that we can't carry out the full factorial, a fractional factorial is an option. This carefully chooses which treatment combinations to omit. The goal is to achieve balance across the levels in a way that allows us to estimate lower-level effects (in particular main effects) but not all high-order interactions. What happens is that high-order interactions are aliased to (confounded with) lower-order effects. For example, you might choose a fractional factorial design so that you can estimate main effects and two-way interactions but not higher-order interactions.

In interpreting the results, I suggest focusing on the decomposition of sums of squares and not on statistical significance. In most cases, we expect the inputs to have at least some effect on the outcome, so the null hypothesis is a straw man. Better to assess the magnitude of the impacts of the different inputs.

When one has a very large number of inputs, one can use the Latin hypercube approach to sample in the input space in a uniform way, spreading the points out so that each input is sampled uniformly. Assume that each input is U(0, 1) (one can easily transform to whatever marginal distributions you want). Suppose that you can run m samples. Then for each input variable, we divide the unit interval into m bins and randomly choose the order of bins and the position within each bin. This is done independently for each variable and then combined to give m samples from the input space; a sketch is given below. We would then analyze main effects and perhaps two-way interactions to assess which inputs seem to be most important.
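
Here's a bare-bones Latin hypercube sketch following the recipe above (my own code; the lhs package on CRAN provides more refined versions):

latinHypercube <- function(m, d) {
    # one column per input; each column places exactly one point in each of the
    # m equal-width bins, at a uniformly chosen position within the bin
    sapply(1:d, function(j) (sample(m) - runif(m)) / m)
}
set.seed(1)
X <- latinHypercube(m = 10, d = 3)
apply(X, 2, sort)  # each column has one point per bin ((k-1)/10, k/10]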

5 Implementation of simulation studies

5.1 Computational efficiency

Parallel processing is often helpful for simulation studies. The reason is that simulation studies are embarrassingly parallel - we can send each replicate to a different computer processor and then collect the results back, and the speedup should scale directly with the number of processors we used. Since we often need to do some sort of looping, writing code in C/C++ and compiling and linking to the code from R may also be a good strategy, albeit one not covered in this course.

Handy functions in R include expand.grid() to get all combinations of a set of vectors and the replicate() function in R, which will carry out the same R expression (often a function call) a given number of times. This can replace the use of a for loop with some gains in cleanliness of your code. Storing results in an array is a natural approach.

thetaLevels <- c("low", "med", "hi")
n <- c(10, 100, 1000)
tVsNorm <- c("t", "norm")
levels <- expand.grid(thetaLevels, tVsNorm, n)
# example of replicate() -- generate m sets of correlated normals
library(fields)  # provides rdist(), used below
set.seed(0)
genFun <- function(n, theta = 1) {
    u <- rnorm(n)
    x <- runif(n)
    Cov <- exp(-rdist(x)/theta)
    U <- chol(Cov)
    return(cbind(x, crossprod(U, u)))
}
m <- 20
simData <- replicate(m, genFun(100, 1))
dim(simData)  # 100 observations by {x, y} values by 20 replicates

## [1] 100 2 20

5.2 Analysis and reporting

Often results are reported simply in tables, but it can be helpful to think through whether a graphical representation is more informative (sometimes it’s not or it’s worse, but in some cases it may be much better). You should set the seed when you start the experiment, so that it’s possible to replicate it. It’s also a good idea to save the current value of the seed whenever you save interim results, so that you can restart simulations (this is particularly helpful for MCMC) at the exact point you left off, including the random number sequence. To enhance reproducibility, it’s good practice to post your simulation code (and potentially data) on your website or as supplementary material with the journal. One should report sample sizes and information about the random number generator.

Here are JASA's requirements on documenting computations: "Results Based on Computation - Papers reporting results based on computation should provide enough information so that readers can evaluate the quality of the results. Such information includes estimated accuracy of results, as well as descriptions of pseudorandom-number generators, numerical algorithms, computers, programming languages, and major software components that were used."

Numerical Integration and Differentiation

November 8, 2012

References:

• Gentle: Computational Statistics

• Monahan: Numerical Methods of Statistics

• Givens and Hoeting: Computational Statistics

Our goal here is to understand the basics of numerical (and symbolic) approaches to approximating derivatives and integrals on a computer. Derivatives are useful primarily for optimization. Integrals arise in approximating expected values and in various places where we need to integrate over an unknown random variable (e.g., Bayesian contexts, random effects models, missing data contexts).

1 Differentiation

1.1 Numerical differentiation

There’s not much to this topic. The basic idea is to approximate the derivative of interest using finite differences. A standard discrete approximation of the derivative is the forward difference

f′(x) ≈ (f(x + h) − f(x)) / h

f′(x) ≈ (f(x + h) − f(x − h)) / (2h)

Provided we already have computed f(x), the forward difference takes half as much computing as the central difference. However, the central difference has an error of O(h²) while the forward difference has an error of O(h).

For second derivatives, if we apply the above approximations to f′(x) and f′(x + h), we get an approximation of the second derivative based on second differences:

f′′(x) ≈ (f′(x + h) − f′(x)) / h ≈ (f(x + 2h) − 2f(x + h) + f(x)) / h².
The corresponding central difference approximation is

f′′(x) ≈ (f(x + h) − 2f(x) + f(x − h)) / h².

For multivariate x, we need to compute directional derivatives. In general these will be in axis-oriented directions (e.g., for the Hessian), but they can be in other directions. The basic idea is to find f(x + he) in expressions such as those above, where e is a unit length vector giving the direction. For axis-oriented directions, we have e_i being a vector with a one in the ith position and zeroes in the other positions, and
∂f/∂x_i ≈ (f(x + h e_i) − f(x)) / h.

Note that for mixed partial derivatives, we need to use e_i and e_j, so the second difference approximation gets a bit more complicated,

∂²f / (∂x_i ∂x_j) ≈ (f(x + h e_j + h e_i) − f(x + h e_j) − f(x + h e_i) + f(x)) / h².

We would have analogous quantities for central difference approximations.

Numerical issues Ideally we would take h very small and get a highly accurate estimate of the derivative. However, the limits of machine precision mean that the difference estimator can behave badly for very small h, since we lose accuracy in computing differences such as between f(x + h) and f(x − h) and from dividing by small h. Therefore we accept a bias in the estimate by not using h so small, often by taking h to be the square root of machine epsilon (i.e., about 1 × 10⁻⁸ on most systems). Actually, we need to account for the order of magnitude of x, so what we really want is h = √ε |x| - i.e., we want it to be in terms relative to the magnitude of x. As an example, recall that if x = 1 × 10⁹ and we did x + h = 1 × 10⁹ + 1 × 10⁻⁸, we would get x + h = 1 × 10⁹ = x because we can only represent about 7 decimal places with precision.

Givens and Hoeting and Monahan point out that some sources recommend the cube root of machine epsilon (about 5 × 10⁻⁶ on most systems), in particular when approximating second derivatives. Let's assess these recommendations empirically in R. We'll use a test function, log Γ(x), for which we can obtain the derivatives with high accuracy using built-in R functions. This is a modification of Monahan's example from his numdif.r code.

# compute first and second derivatives of log(gamma(x)) at x=1/2
options(digits = 9, width = 120)
h <- 10^(-(1:15))
x <- 1/2
fx <- lgamma(x)
# targets: actual derivatives can be computed very accurately using
# built-in R functions:
digamma(x)  # accurate first derivative

## [1] -1.96351003

trigamma(x) # accurate second derivative

## [1] 4.9348022

# calculate discrete differences
fxph <- lgamma(x + h)
fxmh <- lgamma(x - h)
fxp2h <- lgamma(x + 2 * h)
fxm2h <- lgamma(x - 2 * h)
# now find numerical derivatives
fp1 <- (fxph - fx)/h  # forward difference
fp2 <- (fxph - fxmh)/(2 * h)  # central difference
# second derivatives
fpp1 <- (fxp2h - 2 * fxph + fx)/(h * h)  # forward difference
fpp2 <- (fxph - 2 * fx + fxmh)/(h * h)  # central difference
# table of results
cbind(h, fp1, fp2, fpp1, fpp2)

## h fp1 fp2 fpp1 fpp2
## [1,] 1e-01 -1.74131085 -1.99221980 3.67644733e+00 5.01817899e+00
## [2,] 1e-02 -1.93911250 -1.96379057 4.77200996e+00 4.93561416e+00
## [3,] 1e-03 -1.96104543 -1.96351283 4.91803003e+00 4.93481032e+00
## [4,] 1e-04 -1.96326331 -1.96351005 4.93311987e+00 4.93480230e+00
## [5,] 1e-05 -1.96348535 -1.96351003 4.93463270e+00 4.93480257e+00
## [6,] 1e-06 -1.96350756 -1.96351003 4.93505237e+00 4.93483032e+00
## [7,] 1e-07 -1.96350978 -1.96351003 4.91828800e+00 4.95159469e+00
## [8,] 1e-08 -1.96351001 -1.96351003 7.77156117e+00 3.33066907e+00
## [9,] 1e-09 -1.96351002 -1.96351002 -1.11022302e+02 0.00000000e+00
## [10,] 1e-10 -1.96351047 -1.96351047 2.22044605e+04 0.00000000e+00
## [11,] 1e-11 -1.96349603 -1.96350158 -3.33066907e+06 1.11022302e+06
## [12,] 1e-12 -1.96342942 -1.96348493 -1.11022302e+08 1.11022302e+08
## [13,] 1e-13 -1.96398453 -1.96398453 2.22044605e+10 0.00000000e+00
## [14,] 1e-14 -1.96509475 -1.97064587 1.11022302e+12 1.11022302e+12
## [15,] 1e-15 -1.99840144 -1.94289029 0.00000000e+00 -1.11022302e+14

What do we conclude about the advice about using h proportional to either the square root or cube root of machine epsilon?

1.2 Numerical differentiation in R

There are multiple numerical derivative functions in R. numericDeriv() will compute the first derivative. It requires an expression rather than a function as the form in which the function is input, which in some cases might be inconvenient. The functions in the numDeriv package will compute the gradient and Hessian, either in the standard way (using the argument method = 'simple') or with a more accurate approximation (using the argument method = 'Richardson'). For optimization, one might use the simple option, assuming that is faster, while the more accurate approximation might be good for computing the Hessian to approximate the information matrix for getting an asymptotic covariance. (Although in this case, the statistical uncertainty generally will overwhelm any numerical uncertainty.)

x <- 1/2
numericDeriv(quote(lgamma(x)), "x")

## [1] 0.572364943 ## attr(,"gradient") ## [,1] ## [1,] -1.96351001

Note that by default, if you rely on numerical derivatives in optim(), it uses h = 0.001 (the ndeps sub-argument to control), which might not be appropriate if the parameters vary on a small

4 scale. This relatively large value of h is probably chosen based on optim() assuming that you’ve scaled the parameters as described in the text describing the parscale argument.

1.3 Symbolic differentiation

We’ve seen that we often need the first and second derivatives for optimization. Numerical dif- ferentiation is fine, but if we can readily compute the derivatives in closed form, that can improve our optimization. (Venables and Ripley comment that this is particularly the case for the first derivative, but not as much for the second.) In general, using a computer program to do the analytic differentiation is recommended as it’s easy to make errors in doing differentiation by hand. Monahan points out that one of the main causes of error in optimization is human error in coding analytic derivatives, so it’s good practice to avoid this. R has a simple differentiation ability in the deriv() function (which handles the gradient and the Hessian). However it can only handle a limited number of functions. Here’s an example of using deriv() and then embedding the resulting R code in a user-defined function. This can be quite handy, though the format of the result in terms of attributes is not the most handy, so you might want to monkey around with the code more in practice.

deriv(quote(atan(x)), "x") # derivative of simple expression

## expression({ ## .value <- atan(x) ## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x"))) ## .grad[, "x"] <- 1/(1 + x^2) ## attr(.value, "gradient") <- .grad ## .value ## })

# derivative of a function; note we need to pass in an expression, # not the entire function f <- function(x,y) sin(x * y)+x^3+exp(y) newBody <- deriv(body(f), c("x", "y"), hessian = TRUE) # now create a new version of f that provides gradient # and hessian as attributes of the output, # in addition to the function value as the return value f <- function(x, y) {} # function template body(f) <- newBody

5 # try out the new function f(3,1)

## [1] 29.8594018 ## attr(,"gradient") ## x y ## [1,] 26.0100075 -0.251695661 ## attr(,"hessian") ## , , x ## ## x y ## [1,] 17.85888 -1.41335252 ## ## , , y ## ## x y ## [1,] -1.41335252 1.44820176

attr(f(3,1), "gradient")

## x y ## [1,] 26.0100075 -0.251695661

attr(f(3,1), "hessian")

## , , x ## ## x y ## [1,] 17.85888 -1.41335252 ## ## , , y ## ## x y ## [1,] -1.41335252 1.44820176

For more complicated functions, both Maple and Mathematica do symbolic differentiation. Here are some examples in Mathematica, which is available on the SCF machines and through campus: http://ist.berkeley.edu/software-central:

6 # first partial derivative wrt x D[ Exp[x^n] - Cos[x y], x] # second partial derivative D[ Exp[x^n] - Cos[x y], {x, 2}] # partials D[ Exp[x^n] - Cos[x y], x, y] # trig function example D[ ArcTan[x], x]

2 Integration

We’ve actually already discussed numerical integration extensively in the simulation unit, where we considered Monte Carlo approximation of high-dimensional integrals. In the case where we have an integral in just one or two dimensions, MC is fine, but we can get highly-accurate, very fast approximations by numerical integration methods known as quadrature. Unfortunately such approximations scale very badly as the dimension grows, while MC methods scale well, so MC is recommended in higher dimensions. Here’s an empirical example in R, where the MC estimator is

π π 1 sin(x)dx = π sin(x) 1 dx = Ef (π sin(x)) Z0 Z0 π · 

for f = (0,π): U f <- function(x) sin(x) # mathematically, the integral from 0 to pi is 2 # quadrature through integrate() integrate(f, 0, pi)

## 2 with absolute error < 2.2e-14

system.time(integrate(f, 0, pi))

## user system elapsed ## 0 0 0

# MC estimate ninteg <- function(n) mean(sin(runif(n, 0, pi))*pi) n <- 1000 ninteg(n)

7 ## [1] 1.93447152

system.time(ninteg(n))

## user system elapsed ## 0 0 0

n <- 10000 ninteg(n)

## [1] 1.98920619

system.time(ninteg(n))

## user system elapsed ## 0.000 0.000 0.001

n <- 1000000 ninteg(n)

## [1] 1.99924616

system.time(ninteg(n))

## user system elapsed ## 0.140 0.000 0.141

# that was fairly slow, # especially if you need to do a lot of individual integrals

More on this issue below.

2.1 Numerical integration methods

The basic idea is to break the domain into pieces and approximate the integral within each piece:

n 1 b − xi+1 f(x)dx = f(x)dx, Z Z a Xi=0 xi

xi+1 m where we then approximate xi f(x)dx j=0 Aijf(xij∗ ) where xij∗ are the nodes. R ≈ P

8 2.1.1 Newton-Cotes quadrature

Newton-Cotes quadrature has equal length intervals of length h =(b a)/n, with the same number − of nodes in each interval. f(x) is replaced with a polynomial approximation in each interval and

Aij are chosen so that the sum equals the integral of the polynomial approximation on the interval.

A basic example is the Riemann rule, which takes a single node, xi∗ = xi and the “polynomial”

is a constant, f(xi∗), so we have

xi+1 f(x)dx (xi+1 xi)f(xi). Zxi ≈ −

Of course using a piecewise constant to approximate f(x) is not likely to give us high accuracy.

The trapezoidal rule takes xi∗0 = xi, xi∗1 = xi+1 and uses a linear interpolation between f(xi∗0)

and f(xi∗1) to give xi +1 xi+1 xi f(x)dx − (f(xi)+ f(xi+1)). Zxi ≈  2 

Simpson’s rule uses a quadratic interpolation at the points xi∗0 = xi, xi∗1 = (xi + xi+1)/2,

xi∗2 = xi+1 to give

xi +1 xi+1 xi xi + xi+1 f(x)dx − f(xi)+4f + f(xi+1) . Zxi ≈  6   2  

The error of various rules is often quantified as a power of h = x x . The trapezoid rule i+1 − i gives O(h2) while Simpson’s rule gives O(h4).

Romberg quadrature There is an extension of Newton-Cotes quadrature that takes combina- tions of estimates based on different numbers of intervals. This is called Richardson extrapola- tion and when used with the trapezoidal rule is called Romberg quadrature. The result is greatly increased accuracy. A simple example of this is as follows. Let Tˆ(h) be the trapezoidal rule ap- 4Tˆ(h/2) Tˆ(h) − proximation of the integral when the length of each interval is h. Then 3 results in an approximation with error of O(h4) because the differencing is cleverly chosen to kill off the error term that is O(h2). In fact this approximation is Simpson’s rule with intervals of length h/2, with the advantage that we don’t have to do as many function evaluations (2n vs. 4n). Even better, one can iterate this approach for more accuracy as described in detail in Givens and Hoeting. Note that at some point, simply making intervals smaller in quadrature will not improve accu- racy because of errors introduced by the imprecision of computer numbers.

9 2.1.2 Gaussian quadrature

Here the idea is to relax the constraints of equally-spaced intervals and nodes within intervals. We want to put more nodes where the function is larger in magnitude. Gaussian quadrature approximates integrals that are in the form of an expected value as

b m f(x)µ(x)dx w f(x ) Z i i a ≈ Xi=0

where µ(x) is a probability density, with the requirement that xkµ(x)dx = E Xk < for µ ∞ k 0. Note that it can also deal with indefinite integrals where = and/or b = . Typically ≥ −∞ ∞ µ is non-uniform, so the nodes (the quadrature points) cluster in areas of high density.The choice of node locations depends on understanding orthogonal polynomials, which we won’t go into here. It turns out this approach can exactly integrate polynomials of degree 2m +1 (or lower). The advantage is that for smooth functions that can be approximated well by a single polynomial, we get highly accurate results. The downside is that if the function is not well approximated by such a polynomial, the result may not be so good. The Romberg approach is more robust. b b Note that if the problem is not in the form a f(x)µ(x)dx, but rather a f(x)dx, we can reex- b f(x) R R press as a µ(x) µ(x)dx. NoteR that the trapezoidal rule amounts to µ being the uniform distribution with the points equally spaced.

2.1.3 Adaptive quadrature

Adaptive quadrature chooses interval lengths based on the behavior of the integrand. The goal is to have shorter intervals where the function varies more and longer intervals where it varies less. The reason for avoiding short intervals everywhere involves the extra computation and greater opportunity for rounding error.

2.1.4 Higher dimensions

For rectangular regions, one can use the techniques described above over squares instead of inter- vals, but things become more difficult with more complicated regions of integration. The basic result for Monte Carlo integration (i.e., Unit 10 on simulation) is that the error of 1/2 the MC estimator scales as O(m− ), where m is the number of MC samples, regardless of di- mensionality. Let’s consider how the error of quadrature scales. We’ve seen that the error is often quantified as O(hq). In d dimensions, the error is the same as a function of h, but if in one di- mension we need n function evaluations to get intervals of length h, in d dimensions, we need nd function evaluations to get hypercubes with sides of length h. Let’s re-express the error in terms

10 of n rather than h based on h = c/n for a constant c (such as c = b a), which gives us error − q 1/d of O(n− ) for one-dimensional integration. In d dimensions we have n function evaluations 1/d q q/d 1/d per dimension, so the error for fixed n is O((n )− ) = O(n− ) which scales as n− . As an example, suppose d = 10 and we have n = 1000 function evaluations. This gives us an accuracy comparable to one-dimensional integration with n = 10001/10 2, which is awful. Even with ≈ only d = 4, we get n = 10001/4 6, which is pretty bad. This is one version of the curse of ≈ dimensionality.

2.2 Numerical integration in R

R implements an adaptive version of Gaussian quadrature in integrate(). The ’...’ argument allows you to pass additional arguments to the function that is being integrated. The function must be vectorized (i.e., accept a vector of inputs and evaluate and return the function value for each input as a vector of outputs). Note that the domain of integration can be unbounded and if either the upper or lower limit is unbounded, you should enter Inf or -Inf respectively.

integrate(dnorm, -Inf, Inf, 0, 0.1)

## 1 with absolute error < 6.1e-07

integrate(dnorm, -Inf, Inf, 0, 0.001)

## 1 with absolute error < 2.1e-06

integrate(dnorm, -Inf, Inf, 0, 1e-04) # THIS FAILS!

## 0 with absolute error < 0

2.3 Singularities and infinite ranges

A singularity occurs when the function is unbounded, which can cause difficulties with numerical integration. For example, 1 1 = 2, but f(0) = . One strategy is a change of variables. For 0 √x ∞ 1 exp(x) R 1 2 example, to find 0 √x dx, let u = √x, which gives the integral, 2 0 exp(u )du. Another strategyR is to subtract off the singularity. E.g., in the exampleR above, reexpress as

1 exp(x) 1 1 1 1 exp(x) 1 − dx + dx = − dx +2 Z0 √x Z0 √x Z0 √x

11 where we do the second integral analytically. It turns out that the first integral is well-behaved at 0. 1 exp(x) It turns out that R’s integrate() function can handle 0 √x dx directly without us changing the problem statement analytically. Perhaps this has somethingR to do with the use of adaptive quadrature, but I’m not sure.

# doing it directly with integrate() f <- function(x) exp(x)/sqrt(x) integrate(f, 0, 1)

## 2.92530349 with absolute error < 9.4e-06

# subtracting off the singularity f <- function(x) (exp(x) - 1)/sqrt(x) x <- seq(0, 1, len = 200) integrate(f, 0, 1)

## 0.925303567 with absolute error < 7.6e-05

# analytic change of variables, followed by numeric integration f <- function(u) 2 * exp(u^2) integrate(f, 0, 1)

## 2.92530349 with absolute error < 3.2e-14

Infinite ranges Gaussian quadrature deals with the case that a = and/or b = . An- −∞ ∞ other possibility is change of variables using transformations such as 1/x, exp(x)/(1 + exp(x)), exp( x), and x/(1 + x). −

2.4 Symbolic integration

Mathematica and Maple are able to do symbolic integration for many problems that are very hard to do by hand (and with the same concerns as when doing differentiation by hand). So this may be worth a try. # one-dimensional integration Integrate[Sin[x]^2, x]

12 # two-dimensional integration Integrate[Sin[x] Exp[-y^2], x, y]

13 Optimization

November 29, 2012

References:

• Gentle: Computational Statistics

• Lange: Optimization

• Monahan: Numerical Methods of Statistics

• Givens and Hoeting: Computational Statistics

• Materials online from Stanford’s EE364a course on convex optimization, including Boyd and Vandenberghe’s (online) book Convex Optimization, which is linked to from the course webpage.

1 Notation

We’ll make use of the first derivative (the gradient) and second derivative (the Hessian) of func- tions. We’ll generally denote univariate and multivariate functions (without distinguishing between

them) as f(x) with x =(x1,...,xp). The (column) vector of first partial derivatives (the gradient) is f ′(x)= f(x)=( ∂f ,..., ∂f )⊤ and the matrix of second partial derivatives (the Hessian) is ∇ ∂x1 ∂xp

∂2f ∂2f ∂2f 2 ∂x1 ∂x1∂x2 ··· ∂x1∂xp ∂2f ∂2f ∂2f  2  ′′ 2 ∂x1∂x2 ∂x2 ··· ∂x2∂xp f (x)= f (x)= Hf (x)= . . . . ∇  . . ..     ∂2f ∂2f ∂2f  2  ∂x1∂xp ∂x2∂xp ∂x   ··· p    In considering iterative algorithms, I’ll use x0, x1,...,xt, xt+1 to indicate the sequence of val- ∗ ues as we search for the optimum, denoted x . x0 is the starting point, which we must choose (sometimes carefully). If it’s unclear at any point whether I mean a value of x in the sequence or

1 a sub-element of the x vector, let me know, but hopefully it will be clear from context most of the time. I’ll try to use x (or if we’re talking explicitly about a likelihood, θ) to indicate the argument with respect to which we’re optimizing and Y to indicate data involved in a likelihood. I’ll try to use z to indicate covariates/regressors so there’s no confusion with x.

2 Overview

The basic goal here is to optimize a function numerically when we cannot find the maximum (or minimum) analytically. Some examples:

1. Finding the MLE for a GLM

2. Finding least squares estimates for a nonlinear regression model,

Y (g(z ; β),σ2) i ∼N i

where g( ) is nonlinear and we seek to find the value of θ =(β,σ2) that best fits the data. · 3. Maximizing a likelihood under constraints

Maximum likelihood estimation and variants thereof is a standard situation in which optimization comes up. We’ll focus on minimization, since any maximization of f can be treated as minimization of f. The basic setup is to find − arg min f(x) x∈D where D is the domain. Sometimes D = p but other times it imposes constraints on x. When ℜ there are no constraints, this is unconstrained optimization, where any x for which f(x) is de- fined is a possible solution. We’ll assume that f is continuous as there’s little that can be done systematically if we’re dealing with a discontinuous function. In one dimension, minimization is the same as root-finding with the derivative function, since the minimum of a differentiable function can only occur at a point at which the derivative is zero. So with differentiable functions we’ll seek to find x s.t. f ′(x)= f(x)=0. To ensure a minimum, ∇ we want that for all y in a neighborhood of x∗, f(y) f(x∗), or (for twice differentiable functions) ≤ f ′′(x∗)= 2f(x∗)= H (x∗) 0, i.e., that the Hessian is positive semi-definite. ∇ f ≥ Different strategies are used depending on whether D is discrete and countable, or continuous, dense and uncountable. We’ll concentrate on the continuous case but the discrete case can arise in statistics, such as in doing variable selection.

2 In general we rely on the fact that we can evaluate f. Often we make use of analytic or numerical derivatives of f as well. To some degree, optimization is a solved problem, with good software implementations, so it raises the question of how much to discuss in this class. The basic motivation for going into some of the basic classes of optimization strategies is that the function being optimized changes with each problem and can be tricky to optimize, and I want you to know something about how to choose a good approach when you find yourself with a problem requiring optimization. Finding global, as opposed to local, minima can also be an issue. Note that I’m not going to cover MCMC (Markov chain Monte Carlo) methods, which are used for approximating integrals and sampling from posterior distributions in a Bayesian context and in a variety of ways for optimization. If you take a Bayesian course you’ll cover this in detail, and if you don’t do Bayesian work, you probably won’t have much need for MCMC, though it comes up in MCEM (Monte Carlo EM) and simulated annealing, among other places.

Goals for the unit Optimization is a big topic. Here’s what I would like you to get out of this:

1. an understanding of line searches,

2. an understanding of multivariate derivative-based optimization and how line searches are useful within this,

3. an understanding of derivative-free methods,

4. an understanding of the methods used in R’s optimization routines, their strengths and weak- nesses, and various tricks for doing better optimization in R, and

5. a basic idea of what convex optimization is and when you might want to go learn more about it.

3 Univariate function optimization

We’ll start with some strategies for univariate functions. These can be useful later on in dealing with multivariate functions.

3.1 Golden section search

This strategy requires only that the function be unimodal. Assume we have a single minimum, in [a, b]. We choose two points in the interval and evaluate them, f(x1) and f(x2). If f(x1)

3 in [x1,b]. We proceed by choosing a new point in the new, smaller interval and iterate. At each step we reduce the length of the interval in which the minimum must lie. The primary question involves what is an efficient rule to use to choose the new point at each iteration.

Suppose we start with x1 and x2 s.t. they divide [a, b] into three equal segments. Then we use f(x1) and f(x2) to rule out either the leftmost or rightmost segment based on whether f(x1) < f(x2). If we have divided equally, we cannot place the next point very efficiently because either x1 or x2 equally divides the remaining space, so we are forced to divide the remaining space into relative lengths of 0.25, 0.25, and 0.5. The next time around, we may only rule out the shorter segment, which leads to inefficiency. The efficient strategy is to maintain the golden ratio between the distances between the points using φ = (√5 1)/2 .618, the golden ratio. We start with x = a + (1 φ)(b a) and − ≈ 1 − − x = a + φ(b a). Then suppose f(x ) < f(x ). We now choose to place x s.t. it uses the 2 − 1 2 3 golden ratio in the interval [a, x ]: x = a + (1 φ)(x a). Because of the way we’ve set it up, 1 3 − 2 − we once again have the third subinterval, [x1,x2], of equal length as the first subinterval, [a, x3]. The careful choice allows us to narrow the search interval by an equal proportion,1 φ, in each − iteration. Eventually we have narrowed the minimum to between xt−1 and xt, where the difference x x is sufficiently small (within some tolerance - see Section 4 for details), and we report | t − t−1| (xt + xt−1)/2. We’ll see an example of this on the board in class.

3.2 Bisection method

The bisection method requires the existence of the first derivative but has the advantage over the golden section search of halving the interval at each step. We again assume unimodality.

We start with an initial interval (a0,b0) and proceed to shrink the interval. Let’s choose a0

and b0, and set x0 to be the mean of these endpoints. Now we update according to the following

algorithm, assuming our current interval is [at,bt]:

′ ′ [at,xt] iff (at)f (xt) < 0 [at+1,bt+1]= [x ,b ] iff ′(a )f ′(x ) > 0  t t t t

and set xt+1 to the mean of at+1 and bt+1. The basic idea is that if the derivative at both at and xt is

negative, then the minimum must be between xt and bt, based on the intermediate value theorem.

If the derivatives at at and xt are of different signs, then the minimum must be between at and xt. Since the bisection method reduces the size of the search space by one-half at each iteration, one can work out that each decimal place of precision requires 3-4 iterations. Obviously bisection is more efficient than the golden section search because we reduce by 0.5 > 0.382 = 1 φ, − so we’ve gained information by using the derivative. It requires an evaluation of the derivative

4 however, while golden section just requires an evaluation of the original function. Bisection is an example of a bracketing method, in which we trap the minimum within a nested sequence of intervals of decreasing length. These tend to be slow, but if the first derivative is continuous, they are robust and don’t require that a second derivative exist.

3.3 Newton-Raphson (Newton’s method)

3.3.1 Overview

We’ll talk about Newton-Raphson (N-R) as an optimization method rather than a root-finding method, but they’re just different perspectives on the same algorithm. For N-R, we need two continuous derivatives that we can evaluate. The benefit is speed, relative to bracketing methods. We again assume the function is unimodal. The minimum must occur at x∗ s.t. f ′(x∗)=0, provided the second derivative is non-negative at x∗. So we aim to find a zero

(a root) of the first derivative function. Assuming that we have an initial value x0 that is close to x∗, we have the Taylor series approximation

f ′(x) f ′(x )+(x x )f ′′(x ). ≈ 0 − 0 0

Now set f ′(x)=0, since that is the condition we desire (the condition that holds when we are at x∗), and solve for x to get ′ f (x0) x1 = x0 ′′ , − f (x0) ′ f (xt) and iterate, giving us updates of the form xt+1 = xt ′′ . What are we doing intuitively? − f (xt) Basically we are taking the tangent to f(x) at x0 and extrapolating along that line to where it

crosses the x-axis to find x1. We then reevaluate f(x1) and continue to travel along the tangents. One can prove that if f ′(x) is twice continuously differentiable, is convex, and has a root, then N-R converges from any starting point. Note that we can also interpret the N-R update as finding the analytic minimum of the quadratic Taylor series approximation to f(x). Newton’s method converges very quickly (as we’ll discuss in Section 4), but if you start too far from the minimum, you can run into serious problems.

3.3.2 Secant method variation on N-R

Suppose we don’t want to calculate the second derivative required in the divisor of N-R. We might replace the analytic derivative with a discrete difference approximation based on the secant line

5 ′ ′ joining (xt,f (xt)) and (xt−1,f (xt−1)), giving an approximate second derivative:

f ′(x ) f ′(x ) f ′′(x ) t − t−1 . t ≈ x x t − t−1

For this variant on N-R, we need two starting points, x0 and x1. An alternative to the secant-based approximation is to use a standard discrete approximation of the derivative such as f ′(x + h) f ′(x h) f ′′(x ) t − t − . t ≈ 2h

3.3.3 How can Newton’s method go wrong?

Let’s think about what can go wrong - namely when we could have f(xt+1) >f(xt)? Basically, if f ′(x ) is relatively flat, we can get that x x∗ > x x∗ . We’ll see an example on the board t | t+1 − | | t − | and the demo code . Newton’s method can also go uphill when the second derivative is negative, with the method searching for a maximum.

options(width = 55) par(mfrow = c(1, 2)) fp <- function(x, theta = 1) { exp(x * theta)/(1 + exp(x * theta)) - 0.5 } fpp <- function(x, theta = 1) { exp(x * theta)/((1 + exp(x * theta))^2) } xs <- seq(-15, 15, len = 300) plot(xs, fp(xs), type = "l") lines(xs, fpp(xs), lty = 2) # good starting point x0 <- 2 xvals <- c(x0, rep(NA, 9)) for (t in 2:10) { xvals[t] = xvals[t - 1] - fp(xvals[t - 1])/fpp(xvals[t - 1]) } print(xvals)

## [1] 2.000e+00 -1.627e+00 8.188e-01 -9.461e-02 ## [5] 1.412e-04 -4.691e-13 6.142e-17 6.142e-17 ## [9] 6.142e-17 6.142e-17

6 points(xvals, fp(xvals), pch = as.character(1:length(xvals))) # bad starting point x0 <- 2.5 xvals <- c(x0, rep(NA, 9)) for (t in 2:10) { xvals[t] = xvals[t - 1] - fp(xvals[t - 1])/fpp(xvals[t - 1]) } print(xvals)

## [1] 2.50 -3.55 13.85 -515287.63 ## [5] Inf NaN NaN NaN ## [9] NaN NaN

points(xvals, fp(xvals), pch = as.character(1:length(xvals)), col = "red") # whoops # # mistakenly climbing uphill f <- function(x) cos(x) fp <- function(x) -sin(x) fpp <- function(x) -cos(x) xs <- seq(0, 2 * pi, len = 300) plot(xs, f(xs), type = "l", lwd = 2) lines(xs, fp(xs)) lines(xs, fpp(xs), lty = 2) x0 <- 0.2 # starting point fp(x0) # negative

## [1] -0.1987

fpp(x0) # negative

## [1] -0.9801

x1 <- x0 - fp(x0)/fpp(x0) # whoops, we've gone uphill # because of the negative second derivative xvals <- c(x0, rep(NA, 9)) for (t in 2:10) { xvals[t] = xvals[t - 1] - fp(xvals[t - 1])/fpp(xvals[t - 1])

7 } xvals

## [1] 2.000e-01 -2.710e-03 6.634e-09 0.000e+00 ## [5] 0.000e+00 0.000e+00 0.000e+00 0.000e+00 ## [9] 0.000e+00 0.000e+00

points(xvals, fp(xvals), pch = as.character(1:length(xvals)), col = "red")

3

1 1

3

178965 214789635

4 f(xs) fp(xs)

1

2 −0.4 −0.2 0.0 0.2 0.4 2 −1.0 −0.5 0.0 0.5 1.0

−15 −5 0 5 10 15 0 1 2 3 4 5 6

xs xs

# and we've found a maximum rather than a minimum...

One nice, general idea is to use a fast method such as Newton’s method safeguarded by a robust, but slower method. Here’s how one can do this for N-R, safeguarding with a bracketing method such as bisection. Basically, we check the N-R proposed move to see if N-R is proposing a step outside of where the root is known to lie based on the previous steps and the gradient values for those steps. If so, we could choose the next step based on bisection. Another approach is backtracking. If a new value is proposed that yields a larger value of the function, backtrack to find a value that reduces the function. One possibility is a line search but given that we’re trying to reduce computation, a full line search is unwise computationally (also in the multivariate Newton’s method, we are in the middle of an iterative algorithm for which we will

8 just be going off in another direction anyway at the next iteration). A basic approach is to keep backtracking in halves. A nice alternative is to fit a polynomial to the known information about ′ ′′ that slice of the function, namely f(xt+1), f(xt), f (xt) and f (xt) and find the minimum of the polynomial approximation.

4 Convergence ideas

4.1 Convergence metrics

′ We might choose to assess whether f (xt) is near zero, which should assure that we have reached the critical point. However, in parts of the domain where f(x) is fairly flat, we may find the derivative is near zero even though we are far from the optimum. Instead, we generally monitor x x (for the moment, assume x is scalar). We might consider absolute convergence: x | t+1 − t| | t+1 − |xt+1−xt| xt < ǫ or relative convergence, < ǫ. Relative convergence is appealing because it | |xt| accounts for the scale of x, but it can run into problems when xt is near zero, in which case one can use |xt+1−xt| < ǫ. We would want to account for machine precision in thinking about setting |xt|+ǫ ǫ. For relative convergence a reasonable choice of ǫ would be to use the square root of machine epsilon or about 1 10−8. This is the reltol argument in optim() in R. × Problems with the optimization may show up in a convergence measure that fails to decrease or cycles (oscillates). Software generally has a stopping rule that stops the algorithm after a fixed number of iterations; these can generally be changed by the user. When an algorithm stops because of the stopping rule before the convergence criterion is met, we say the algorithm has failed to converge. Sometimes we just need to run it longer, but often it indicates a problem with the function being optimized or with your starting value. For multivariate optimization, we use a distance metric between x and x , such as x t+1 t k t+1 − x , often with p =1 or p =2. tkp

4.2 Starting values

Good starting values are important because they can improve the speed of optimization, prevent divergence or cycling, and prevent finding local optima. Using random or selected multiple starting values can help with multiple optima.

4.3 Convergence rates

Let ǫ = x x∗ . If the limit t | t − |

9 ǫ lim | t+1| = c t→∞ ǫ β | t| exists for β > 0 and c = 0, then a method is said to have order of convergence β. This basically 6 measures how big the error at the t +1th iteration is relative to that at the tth iteration. Bisection doesn’t formally satisfy the criterion needed to make use of this definition, but roughly speaking it has linear convergence (β =1). Next we’ll see that N-R has quadratic conver- gence (β =2), which is fast. To analyze convergence of N-R, consider a Taylor expansion of the gradient at the minimum, ∗ x , around the current value, xt: 1 f ′(x∗)= f ′(x )+(x∗ x )f ′′(x )+ (x∗ x )2f ′′′(ξ)=0, t − t t 2 − t

′ ∗ f (xt) for some ξt [xt,x ]. Making use of the N-R update equation: xt+1 = xt ′′ , and some ∈ − f (xt) algebra, we have x∗ x 1 f ′′′(ξ ) | − t+1| = t . (x∗ x )2 2 f ′′(x ) − t t If the limit of the ratio on the right hand side exists and is equal to c, then we see that β = 2. If c were one, then we see that if we have k digits of accuracy at t, we’d have 2k digits at t +1, which justifies the characterization of quadratic convergence being fast. In practice c will moderate the rate of convergence. The smaller c the better, so we’d like to have the second derivative be large and the third derivative be small. The expression also indicates we’ll have a problem if ′′ f (xt)=0 at any point [think about what this corresponds to graphically - what is our next step ′′ when f (xt)=0?]. The characteristics of the derivatives determine the domain of attraction (the region in which we’ll converge rather than diverge) of the minimum. Givens and Hoeting show that using the secant-based approximation to the second derivative in N-R has order of convergence, β 1.62. ≈ Here’s an example of convergence comparing bisection and N-R:

options(digits = 10) f <- function(x) cos(x) fp <- function(x) -sin(x) fpp <- function(x) -cos(x) xstar <- pi # known minimum # N-R x0 <- 2 xvals <- c(x0, rep(NA, 9))

10 for (t in 2:10) { xvals[t] <- xvals[t - 1] - fp(xvals[t - 1])/fpp(xvals[t - 1]) } print(xvals)

## [1] 2.000000000 4.185039863 2.467893675 3.266186278 ## [5] 3.140943912 3.141592654 3.141592654 3.141592654 ## [9] 3.141592654 3.141592654

# bisection bisecStep <- function(interval, fp) { xt <- mean(interval) if (fp(interval[1]) * fp(xt) <= 0) interval[2] <- xt else interval[1] <- xt return(interval) } nIt <- 30 a0 <- 2 b0 <- (3 * pi/2) - (xstar - a0) # have b0 be as far from min as a0 for fair comparison # with N-R interval <- matrix(NA, nr = nIt, nc = 2) interval[1, ] <- c(a0, b0) for (t in 2:nIt) { interval[t, ] <- bisecStep(interval[t - 1, ], fp) } rowMeans(interval)

## [1] 2.785398163 3.178097245 2.981747704 3.079922475 ## [5] 3.129009860 3.153553552 3.141281706 3.147417629 ## [9] 3.144349668 3.142815687 3.142048697 3.141665201 ## [13] 3.141473454 3.141569328 3.141617264 3.141593296 ## [17] 3.141581312 3.141587304 3.141590300 3.141591798 ## [21] 3.141592547 3.141592922 3.141592734 3.141592641 ## [25] 3.141592687 3.141592664 3.141592652 3.141592658 ## [29] 3.141592655 3.141592654

11 5 Multivariate optimization

Optimizing as the dimension of the space gets larger becomes increasingly difficult. First we’ll discuss the idea of profiling to reduce dimensionality and then we’ll talk about various numerical techniques, many of which build off of Newton’s method.

5.1 Profiling

A core technique for likelihood optimization is to analytically maximize over any parameters for

which this is possible. Suppose we have two sets of parameters, θ1 and θ2, and we can analytically maximize w.r.t θ2. This will give us θˆ2(θ1), a function of the remaining parameters over which analytic maximization is not possible. Plugging in θˆ2(θ1) into the objective function (in this case generally the likelihood or log likelihood) gives us the profile (log) likelihood solely in terms of the obstinant parameters. For example, suppose we have the regression likelihood with correlated errors: Y (Xβ,σ2Σ(ρ)), ∼N where Σ(ρ) is a correlation matrix that is a function of a parameter, ρ. The maximum w.r.t. β is easily seen to be the GLS estimator βˆ(ρ)=(X⊤Σ(ρ)−1X)−1X⊤Σ(ρ)−1Y . In general such a maximum is a function of all of the other parameters, but conveniently it’s only a function of ρ here. This gives us the initial profile likelihood

1 (Y Xβˆ(ρ))−⊤Σ(ρ)−1(Y Xβˆ(ρ)) exp − − . (σ2)n/2 Σ(ρ) 1/2 − 2σ2 | | ! We then notice that the likelihood is maximized w.r.t. σ2 at

(Y Xβˆ(ρ))⊤Σ(ρ)−1(Y Xβˆ(ρ)) σˆ2(ρ)= − − . n

This gives us the final profile likelihood,

1 1 1 exp( n), Σ(ρ) 1/2 (σˆ2(ρ))n/2 −2 | | a function of ρ only, for which numerical optimization is much simpler.

12 5.2 Newton-Raphson (Newton’s method)

For multivariate x we have the Newton-Raphson update x = x f ′′(x )−1f ′(x ), or in our t+1 t − t t other notation, x = x H (x )−1 f(x ). t+1 t − f t ∇ t In class we’ll use the demo code for an example of finding the nonlinear least squares fit to some weight loss data to fit the model

−ti/β2 Yi = β0 + β12 + ǫi.

Some of the things we need to worry about with Newton’s method in general about are (1) good starting values, (2) positive definiteness of the Hessian, and (3) avoiding errors in finding the derivatives. A note on the positive definiteness: since the Hessian may not be positive definite (although it may well be, provided the function is approximately locally quadratic), one can consider modifying the Cholesky decomposition of the Hessian to enforce positive definiteness by adding diagonal elements to Hf as necessary.

5.3 Fisher scoring variant on N-R

The Fisher information (FI) is the expected value of the outer product of the gradient of the log- likelihood with itself I(θ)= E ( f(y) f(y)⊤), f ∇ ∇ where the expected value is with respect to the data distribution. Under regularity conditions (true for exponential families), the expectation of the Hessian of the log-likelihood is minus the Fisher information, E H (y) = I(θ). We get the observed Fisher information by plugging the data f f − values into either expression instead of taking the expected value. Thus, standard N-R can be thought of as using the observed Fisher information to find the updates. Instead, if we can compute the expectation, we can use minus the FI in place of the Hessian. The result is the Fisher scoring (FS) algorithm. Basically instead of using the Hessian for a given set of data, we are using the FI, which we can think of as the average Hessian over repeated samples of data from the data distribution. FS and N-R have the same convergence properties (i.e., quadratic convergence) but in a given problem, one may be computationally or analytically easier. Givens and Hoeting comment that FS works better for rapid improvements at the beginning of iterations and N-R better for refinement at the end. In the demo code, we try out Fisher scoring in the weight loss example.

13 The Gauss-Newton algorithm for nonlinear least squares involves using the FI in place of the Hessian in determining a Newton-like step. nls() in R uses this approach. Note that this is not exactly the same updating as our manual coding of FS for the weight loss example.

Connections between statistical uncertainty and ill-conditionedness When either the observed or expected FI matrix is nearly singular this means we have a small eigenvalue in the inverse co- variance (the precision), which means a large eigenvalue in the covariance matrix. This indicates some linear combination of the parameters has low precision (high variance), and that in that di- rection the likelihood is nearly flat. As we’ve seen with N-R, convergence slows with shallow gradients, and we may have numerical problems in determining good optimization steps when the likelihood is sufficiently flat. So convergence problems and statistical uncertainty go hand in hand. One, but not the only, example of this occurs when we have nearly collinear regressors.

5.4 IRLS (IWLS) for GLMs

As most of you know, iterative reweighted least squares (also called iterative weighted least squares) is the standard method for estimation with GLMs. It involves linearizing the model and using working weights and working variances and solving a weighted least squares (WLS) problem (the generic WLS solution is βˆ =(X⊤WX)−1X⊤WY ). Exponential families can be expressed as

f(y; θ,φ) = exp((yθ b(θ))/a(φ)+ c(y,φ)), − with E(Y ) = b′(θ) and Var(Y ) = b′′(θ). If we have a GLM in the canonical parameterization (log link for Poisson data, logit for binomial), we have the natural parameter θ equal to the linear predictor, θ = η. A standard linear predictor would simply be η = Xβ. Considering N-R for a GLM in the canonical parameterization (and ignoring a(φ), which is one for logistic and Poisson regression), we find that the gradient is the inner product of the covariates and a residual vector, f(β)=(Y E(Y ))⊤X, and the Hessian is 2f(β) = X⊤WX where ∇ − ∇ − W is a diagonal matrix with Var(Y ) on the diagonal (the working weights). Note that both E(Y ) { i } and the variances in W depend on β, so these will change as we iteratively update β. Therefore, the N-R update is β = β +(X⊤W X)−1X⊤(Y E(Y ) ) t+1 t t − t where E(Y )t and Wt are the values at the current parameter estimate, βt . For example, for ⊤ exp(Xi βt) logistic regression, Wt,ii = pti(1 pti) and E(Y )ti = pti where pti = ⊤ . In the canonical − 1+exp(Xi βt) parameterization of a GLM, the Hessian does not depend on the data, so the observed and expected

14 FI are the same, and therefore N-R and FS are the same. The update above can be rewritten in the standard form of IRLS as a WLS problem,

⊤ −1 ⊤ ˜ βt+1 =(X Wt X) X WtYt, where the so-called working observations are Y˜ = Xβ + W −1(Y E(Y ) ). Note that these are t t t − t on the scale of the linear predictor. While Fisher scoring is standard for GLMs, you can also use general purpose optimization routines. IRLS is a special case of the general Gauss-Newton method for nonlinear least squares.

5.5 Descent methods and Newton-like methods

More generally a Newton-like method has updates of the form

x = x α M −1f ′(x ). t+1 t − t t t

We can choose Mt in various ways, including as an approximation to the second derivative. This opens up several possibilities:

1. using more computationally efficient approximations to the second derivative,

2. avoiding steps that do not go in the correct direction (i.e., go uphill when minimizing), and

3. scaling by αt so as not to step too far.

Let’s consider a variety of strategies.

5.5.1 Descent methods

The basic strategy is to choose a good direction and then choose the longest step for which the function continues to decrease. Suppose we have a direction, pt. Then we need to move xt+1 = xt + αtpt, where αt is a scalar, choosing a good αt. We might use a line search (e.g., bisection or golden section search) to find the local minimum of f(xt + αtpt) with respect to αt. However, we often would not want to run to convergence, since we’ll be taking additional steps anyway.

Steepest descent chooses the direction as the steepest direction downhill, setting Mt = I, since the gradient gives the steepest direction uphill (the negative sign in the equation below has us move directly downhill rather than directly uphill). A better approach is to scale the step

x = x α f ′(x ) t+1 t − t t 15 where the contraction, or step length, parameter αt is chosen sufficiently small to ensure that we descend, via some sort of line search. The critical downside to steepest ascent is that when the contours are elliptical, it tends to zigzag; here’s an example. Note that I do a full line search (using the golden section method via optimize()) at each step in the direction of steepest descent - this is generally computationally wasteful, but I just want to illustrate how steepest descent can go wrong, even if you go the “right” amount in each direction.

f <- function(x) { x[1]^2/1000 + 4 * x[1] * x[2]/1000 + 5 * x[2]^2/1000 } fp <- function(x) { c(2 * x[1]/1000 + 4 * x[2]/1000, 4 * x[1]/1000 + 10 * x[2]/1000) } lineSearch <- function(alpha, xCurrent, direction, FUN) { newx <- xCurrent + alpha * direction FUN(newx) } nIt <- 50 xvals <- matrix(NA, nr = nIt, nc = 2) xvals[1, ] <- c(7, -4) for (t in 2:50) { newalpha <- optimize(lineSearch, interval = c(-5000, 5000), xCurrent = xvals[t - 1, ], direction = fp(xvals[t - 1, ]), FUN = f)$minimum xvals[t, ] <- xvals[t - 1, ] + newalpha * fp(xvals[t - 1, ]) } x1s <- seq(-5, 8, len = 100) x2s = seq(-5, 2, len = 100) fx <- apply(expand.grid(x1s, x2s), 1, f) # plot f(x) surface on log scale image.plot(x1s, x2s, matrix(log(fx), 100, 100), xlim = c(-5, 8), ylim = c(-5, 2)) lines(xvals) # overlay optimization path

16 −2

−4

−6 x2s −8

−10

−12 −5 −4 −3 −2 −1 0 1 2 −4 −2 0 2 4 6 8

x1s

# kind of slow

If the contours are circular, steepest descent works well. Newton’s method deforms elliptical contours based on the Hessian. Another way to think about this is that steepest descent does not take account of the rate of change in the gradient, while Newton’s method does. The general descent algorithm is

x = x α M −1f ′(x ), t+1 t − t t t

where Mt is generally chose to approximate the Hessian and αt allows us to adjust the step in a smart way. Basically, since the negative gradient tells us the direction that descends (at least within a small neighborhood), if we don’t go too far, we should be fine and should work our way downhill. One can work this out formally using a Taylor approximation to f(x ) f(x ) and see that we t+1 − t make use of Mt being positive definite. (Unfortunately backtracking with positive definite Mt does not give a theoretical guarantee that the method will converge. We also need to make sure that the steps descend sufficiently quickly and that the algorithm does not step along a level contour of f.)

17 The conjugate gradient algorithm for iteratively solving large systems of equations is all about choosing the direction and the step size in a smart way given the optimization problem at hand.

5.5.2 Quasi-Newton methods such as BFGS

Other replacements for the Hessian matrix include estimates that do not vary with t and finite difference approximations. When calculating the Hessian is expensive, it can be very helpful to substitute an approximation. A basic finite difference approximation requires us to compute finite differences in each dimen-

sion, but this could be computationally burdensome. A more efficient strategy for choosing Mt+1

is to (1) make use of Mt and (2) make use of the most recent step to learn about the curvature of ′ f (x) in the direction of travel. One approach is to use a rank one update to Mt.

A basic strategy is to choose Mt+1 such that the secant condition is satisfied:

M (x x )= f(x ) f(x ), t+1 t+1 − t ∇ t+1 −∇ t which is motivated by the fact that the secant approximates the gradient in the direction of travel.

Basically this says to modify Mt in such a way that we incorporate what we’ve learned about the

gradient from the most recent step. Mt+1 is not fully determined based on this, and we generally

impose other conditions, in particular that Mt+1 is symmetric and positive definite. Defining st = x x and y = f(x ) f(x ), the unique, symmetric rank one update (why is the t+1 − t t ∇ t+1 − ∇ t following a rank one update?) that satisfies the secant condition is

(y M s )(y M s )⊤ M = M + t − t t t − t t . t+1 t (y M s )⊤s t − t t t

If the denominator is positive, Mt+1 may not be positive definite, but this is guaranteed for non- positive values of the denominator. One can also show that one can achieve positive definiteness by shrinking the denominator toward zero sufficiently.

A standard approach to updating Mt is a commonly-used rank two update that generally results in Mt+1 being positive definite is

⊤ ⊤ Mtst(Mtst) ytyt Mt+1 = Mt ⊤ + ⊤ , − st Mtst st yt

which is known as the Broyden-Fletcher-Goldfarb-Shanno (BFGS) update. This is one of the methods used in R in optim(). −1 −1 Question: how can we update Mt to Mt+1 efficiently? It turns out there is a way to update the Cholesky of Mt efficiently and this is a better approach than updating the inverse.

18 The order of convergence of quasi-Newton methods is generally slower than the quadratic convergence of N-R because of the approximations but still faster than linear. In general, quasi- Newton methods will do much better if the scales of the elements of x are similar. Lange suggests using a starting point for which one can compute the expected information, to provide a good starting value M0. Note that for estimating a covariance based on the numerical information matrix, we would not want to rely on Mt from the final iteration, as the approximation may be poor. Rather we would spend the effort to better estimate the Hessian directly at x∗.

5.6 Gauss-Seidel

Gauss-Seidel is also known a back-fitting or cyclic coordinate descent. The basic idea is to work element by element rather than having to choose a direction for each step. For example backfitting

used to be used to fit generalized additive models of the form E(Y )= f1(z1)+f2(z2)+...+fp(zp). ′ The basic strategy is to consider the jth component of f (x) as a univariate function of xj ′ only and find the root, xj,t+1 that gives fj(xj,t+1)=0. One cycles through each element of x to complete a single cycle and then iterates. The appeal is that univariate root-finding/minimization is easy, often more stable than multivariate, and quick. However, Gauss-Seidel can zigzag, since you only take steps in one dimension at a time, as we see here.

f <- function(x) { return(x[1]^2/1000 + 4 * x[1] * x[2]/1000 + 5 * x[2]^2/1000) } f1 <- function(x1, x2) { # f(x) as a function of x1 return(x1^2/1000 + 4 * x1 * x2/1000 + 5 * x2^2/1000) } f2 <- function(x2, x1) { # f(x) as a function of x2 return(x1^2/1000 + 4 * x1 * x2/1000 + 5 * x2^2/1000) } x1s <- seq(-5, 8, len = 100) x2s = seq(-5, 2, len = 100) fx <- apply(expand.grid(x1s, x2s), 1, f) image.plot(x1s, x2s, matrix(log(fx), 100, 100)) nIt <- 49

19 xvals <- matrix(NA, nr = nIt, nc = 2) xvals[1, ] <- c(7, -4) # 5, -10 for (t in seq(2, nIt, by = 2)) { newx1 <- optimize(f1, x2 = xvals[t - 1, 2], interval = c(-40, 40))$minimum xvals[t, ] <- c(newx1, xvals[t - 1, 2]) newx2 <- optimize(f2, x1 = newx1, interval = c(-40, 40))$minimum xvals[t + 1, ] <- c(newx1, newx2) } lines(xvals)

−2

−4

−6 x2s −8

−10

−12 −5 −4 −3 −2 −1 0 1 2 −4 −2 0 2 4 6 8

x1s

In the notes for Unit 9 on linear algebra, I discussed the use of Gauss-Seidel to iteratively solve Ax = b in situations where factorizing A (which of course is O(n3)) is too computationally expensive.

20 The lasso The lasso uses an L1 penalty in regression and related contexts. A standard formula- tion for the lasso in regression is to minimize

Y Xβ 2 + λ β k − k2 | j| j X to find βˆ(λ) for a given value of the penalty parameter, λ. A standard strategy to solve this problem is to use coordinate descent, either cyclically, or by using directional derivatives to choose the coordinate likely to decrease the objective function the most (a greedy strategy). We need to use directional derivatives because the penalty function is not differentiable, but does have directional

derivatives in each direction. The directional derivative of the objective function for βj is

2x (Y Xβ) λ − j i − ± i X where we add λ if β 0 and you subtract λ if β < 0. If β is 0, then a step in either direction j ≥ j j,t contributes +λ to the derivative as the contribution of the penalty.

Once we have chosen a coordinate, we set the directional derivative to zero and solve for βj to

obtain βj,t+1. The LARS (least angle regression) algorithm uses a similar strategy that allows one to compute

βˆλ for all values of λ at once. The lasso can also be formulated as the constrained minimization of Y Xβ 2 s.t. β k − k2 j | j|≤ c, with c now playing the role of the penalty parameter. Solving this minimization problemP would take us in the direction of quadratic programming, a special case of convex programming, dis- cussed in Section 9.

5.7 Nelder-Mead

This approach avoids using derivatives or approximations to derivatives. This makes it robust, but also slower than Newton-like methods. The basic strategy is to use a simplex, a polytope of p +1 points in p dimensions (e.g., a triangle when searching in two dimensions, tetrahedron in three dimensions...) to explore the space, choosing to shift, expand, or contract the polytope based on the evaluation of f at the points. The algorithm relies on four tuning factors: a reflection factor, α > 0; an expansion factor, γ > 1; a contraction factor, 0 <β< 1; and a shrinkage factor, 0 <δ< 1. First one chooses an initial simplex: p +1 points that serve as the vertices of a convex hull.

1. Evaluate and order the points, x ,...,x based on f(x ) ... f(x ). Let x¯ be the 1 p+1 1 ≤ ≤ p+1 average of the first px’s.

21 2. (Reflection) Reflect xp+1 across the hyperplane (a line when p +1=3) formed by the other

points to get xr, based on α.

• x = (1+ α)¯x αx r − p+1

3. If f(xr) is between the best and worst of the other points, the iteration is done, with xr

replacing xp+1. We’ve found a good direction to move.

4. (Expansion) If f(xr) is better than all of the other points, expand by extending xr to xe based on γ, because this indicates the optimum may be further in the direction of reflection.

If f(xe) is better than f(xr), use xe in place of xp+1. If not, use xr. The iteration is done.

• x = γx + (1 γ)¯x e r −

5. If f(xr) is worse than all the other points, but better than f(xp+1), let xh = xr. Otherwise

f(xr) is worse than f(xp+1) so let xh = xp+1. In either case, we want to concentrate our polytope toward the other points.

(a) (Contraction) Contract xh toward the hyperplane formed by the other points, based on

β, to get xc. If the result improves upon f(xh) replace xp+1 with xc. Basically, we haven’t found a new point that is better than the other points, so we want to contract the simplex away from the bad point.

• x = βx + (1 β)¯x c h −

(b) (Shrinkage) Otherwise (if xc is not better than xp+1) shrink the simplex toward x1. Basically this suggests our step sizes are too large and we should shrink the simplex, shrinking towards the best point.

• x = δx + (1 δ)x for i =2,...,p +1 i i − 1 Convergence is assessed based on the sample variance of the function values at the points, the total of the norms of the differences between the points in the new and old simplexes, or the size of the simplex. This is the default in optim() in R, however it is relatively slow, so you may want to try one of the alternatives, such as BFGS.

5.8 Simulated annealing (SA)

Simulated annealing is a stochastic descent algorithm, unlike the deterministic algorithms we’ve already discussed. It has a couple critical features that set it aside from other approaches. First,

22 uphill moves are allowed; second, whether a move is accepted is stochastic, and finally, as the iterations proceed the algorithm becomes less likely to accept uphill moves. Assume we are minimizing a negative log likelihood as a function of θ, f(θ). The basic idea of simulated annealing is that one modifies the objective function, f in this case, to make it less peaked at the beginning, using a “temperature” variable that changes over time. This helps to allow moves away from local minima, when combined with the ability to move uphill. The name comes from an analogy to heating up a solid to its melting temperature and cooling it slowly - as it cools the atoms go through rearrangements and slowly freeze into the crystal configuration that is at the lowest energy level. Here’s the algorithm. We divide up iterations into stages, j =1, 2,... in which the temperature variable, τj, is constant. Like MCMC, we require a proposal distribution to propose new values of θ.

1. Propose to move from θ to θ˜ from a proposal density, g ( θ ), such as a normal distribution t t ·| t centered at θt.

2. Accept θ˜ as θ according to the probability min(1, exp((f(θ ) f(θ˜))/τ ) - i.e., accept if t+1 t − j a uniform random deviate is less than that probability. Otherwise set θt+1 = θt. Notice that

for larger values of τj the differences between the function values at the two locations are reduced (just like a large standard deviation spreads out a distribution). So the exponentiation

smooths out the objective function when τj is large.

3. Repeat steps 1 and 2 mj times.

4. Increment the temperature and cooling schedule: τj = α(τj−1) and mj = β(mj−1). Back to step 1.

The temperature should slowly decrease to 0 while the number of iterations, mj, should be large. Choosing these ’schedules’ is at the core of implementing SA. Note that we always accept downhill moves in step 2 but we sometimes accept uphill moves as well.

For each temperature, SA produces an MCMC based on the Metropolis algorithm. So if mj is long enough, we should sample from the stationary distribution of the Markov chain, exp( f(θ)/τ )). − j Provided we can move between local minima, the chain should gravitate toward the global minima because these are increasingly deep (low values) relative to the local minima as the temperature drops. Then as the temperature cools, θt should get trapped in an increasingly deep well centered on the global minimum. There is a danger that we will get trapped in a local minimum and not be able to get out as the temperature drops, so the temperature schedule is quite important in trying to avoid this.

23 A wide variety of schedules have been tried. One approach is to set m = 1 j and α(τ ) = j ∀ j−1 τj−1 for a small a. For a given problem it can take a lot of experimentation to choose τ and m 1+aτj−1 0 0 and the values for the scheduling functions. For the initial temperature, it’s a good idea to choose it large enough that exp((f(θ ) f(θ ))/τ ) 1 for any pair θ ,θ in the domain, so that the i − j 0 ≈ { i j} algorithm can visit the entire space initially. Simulated annealing can converge slowly. Multiple random starting points or stratified starting points can be helpful for finding a global minimum. However, given the slow convergence, these can also be computationally burdensome.

6 Basic optimization in R

6.1 Core optimization functions

R has several optimization functions. • optimize() is good for 1-d optimization: “The method used is a combination of golden section search and successive parabolic interpolation, and was designed for use with continuous functions.”

• Another option is uniroot() for finding the zero of a function, which you can use to minimize a function if you can compute the derivative.

• For more than one variable, optim() uses a variety of optimization methods including the ro- bust Nelder-Mead method, the BFGS quasi-Newton method and simulated annealing. You can choose which method you prefer and can try multiple methods. You can supply a gra- dient function to optim() for use with the Newton-related methods but it can also calculate numerical derivatives on the fly. You can have optim() return the Hessian at the optimum (based on a numerical estimate), which then allows straighforward calculation of asymptotic variances based on the information matrix.

• Also for multivariate optimization, nlm() uses a Newton-style method, for which you can supply analytic gradient and Hessian, or it will estimate these numerically. nlm() can also return the Hessian at the optimum.

• The optimx package provides optimx(), which is a wrapper for a variety of optimization methods (including many of those in optim(), as well as nlm(). One nice feature is that it allow you to use multiple methods in the same function call. In the demo code, we’ll work our way through a real example of optimizing a likelihood for some climate data on extreme precipitation.

24 6.2 Various considerations in using the R functions

As we've seen, initial values are important for avoiding divergence (e.g., in N-R), for increasing the speed of convergence, and for helping to avoid local optima. So it is well worth the time to try to figure out a good starting value or multiple starting values for a given problem.

Scaling can be important. One useful step is to make sure the problem is well-scaled, namely that a unit step in any parameter has a comparable change in the objective function, preferably approximately a unit change at the optimum. optim() allows you to supply scaling information through the parscale component of the control argument. Basically, if x_j varies at p orders of magnitude smaller than the other xs, we want to reparameterize to x_j* = x_j · 10^p and then convert back to the original scale after finding the answer. Or we may want to work on the log scale for some variables, reparameterizing as x_j* = log(x_j). We could make such changes manually in our expression for the objective function or make use of arguments such as parscale. If the function itself gives very large or small values near the solution, you may want to rescale the entire function to avoid calculations with very large or small numbers. This can avoid problems such as apparent convergence because a gradient is near zero simply because the scale of the function is small. In optim() this can be controlled with the fnscale component of control.

Always consider your answer and make sure it makes sense, in particular that you haven't 'converged' to an extreme value on the boundary of the space.

Venables and Ripley suggest that it is often worth supplying analytic first derivatives rather than having a routine calculate numerical derivatives, but not worth supplying analytic second derivatives. As we've seen, R can do symbolic (i.e., analytic) differentiation to find first and second derivatives using deriv().

In general for software development it's obviously worth putting more time into figuring out the best optimization approach and supplying derivatives. For a one-off analysis, you can try a few different approaches and assess sensitivity.

The nice thing about likelihood optimization is that the asymptotic theory tells us that with large samples, the likelihood is approximately quadratic (i.e., the asymptotic normality of MLEs), which makes for a nice surface over which to do optimization. When optimizing with respect to variance components and other parameters that are non-negative, one approach to dealing with the constraints is to optimize with respect to the log of the parameter.
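Returning to the scaling point above, here is a small hedged sketch (the badly scaled objective and the scaling values are invented) of passing parscale and fnscale through optim()'s control argument:

## Sketch: a badly scaled problem in which the second parameter is roughly
## five orders of magnitude smaller than the first (values invented).
f <- function(x) (x[1] - 3)^2 + 1e10 * (x[2] - 2e-5)^2

## parscale gives the typical magnitude of each parameter (optimization is done
## on par/parscale); fnscale rescales the objective function value itself.
optim(c(0, 0), f, method = "BFGS",
      control = list(parscale = c(1, 1e-5), fnscale = 1))$par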

7 Combinatorial optimization over discrete spaces

Many statistical optimization problems involve continuous domains, but sometimes there are prob- lems in which the domain is discrete. Variable selection is an example of this.

Simulated annealing can be used for optimizing in a discrete space. Another approach uses genetic algorithms, in which one sets up the dimensions as loci grouped on a chromosome and has mutation and crossover steps in which two potential solutions reproduce. An example would be in high-dimensional variable selection. Stochastic search variable selection is a popular Bayesian technique for variable selection that involves MCMC.

8 Convexity

Many optimization problems involve (or can be transformed into) convex functions. Convex optimization (also called convex programming) is a big topic and one that we'll only scratch the surface of in Sections 8 and 9. The goal here is to give you enough of a sense of the topic that you know when you're working on a problem that might involve convex optimization, in which case you'll need to go learn more. Optimization for convex functions is simpler than for ordinary functions because we don't have to worry about local optima - any stationary point (point where the gradient is zero) is a global minimum.

A set S in ℜ^p is convex if any line segment between two points in S lies entirely within S. More generally, S is convex if any convex combination is itself in S, i.e., Σ_{i=1}^m α_i x_i ∈ S for non-negative weights, α_i, that sum to 1. Convex functions are defined on convex sets - f is convex if for points in a convex set, x_i ∈ S, we have f(Σ_{i=1}^m α_i x_i) ≤ Σ_{i=1}^m α_i f(x_i). Strict convexity is when the inequality is strict (no equality).

The first-order convexity condition relates a convex function to its first derivative: f is convex if and only if f(x) ≥ f(y) + ∇f(y)⊤(x − y) for y and x in the domain of f. We can interpret this as saying that the first-order Taylor approximation to f is tangent to and below (or touching) the function at all points.

The second-order convexity condition is that a function is convex if (provided its first derivative exists) the derivative is non-decreasing, in which case we have f′′(x) ≥ 0 ∀ x (for univariate functions). If we have f′′(x) ≤ 0 ∀ x (a concave, or convex down, function) we can always consider −f(x), which is convex. Convexity in multiple dimensions means that the gradient is non-decreasing in all dimensions. If f is twice differentiable, then if the Hessian is positive semi-definite, f is convex.

There are a variety of results that allow us to recognize and construct convex functions based on knowing what operations create and preserve convexity. The Boyd book is a good source for material on such operations. Note that norms are convex functions (based on the triangle inequality), ‖Σ_{i=1}^n α_i x_i‖ ≤ Σ_{i=1}^n α_i‖x_i‖.
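As a modest illustration of the second-order condition (the function below is invented, and this only checks convexity locally at a few points rather than over the whole domain), one can examine the eigenvalues of a numerically estimated Hessian, e.g., via optimHess() in base R:

## Sketch: checking the Hessian condition numerically at a few points; this is
## a local check, not a proof of convexity over the domain.
f <- function(x) exp(x[1]) + x[1]^2 + 2 * x[2]^2 + x[1] * x[2]  # invented function
for (pt in list(c(0, 0), c(1, -2), c(-3, 4))) {
  H <- optimHess(pt, f)                       # numerical Hessian at pt
  print(eigen(H, only.values = TRUE)$values)  # all non-negative is consistent with convexity
}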

We'll talk about a general algorithm that works for convex functions (the MM algorithm) and about the EM algorithm that is well-known in statistics and is a special case of MM.

8.1 MM algorithm

The MM algorithm is really more of a principle for constructing problem-specific algorithms. MM stands for majorize-minorize. We'll use the majorize part of it to minimize functions - the minorize part is the counterpart for maximizing functions.

Suppose we want to minimize a convex function, f(x). The idea is to construct a majorizing function at x_t, which we'll call g. g majorizes f at x_t if f(x_t) = g(x_t) and f(x) ≤ g(x) ∀ x.

The iterative algorithm is as follows. Given x_t, construct a majorizing function g at x_t. Then

minimize g w.r.t. x (or at least move downhill, such as with a modified Newton step) to find x_{t+1}. Then we iterate, finding the next majorizing function. The algorithm is obviously guaranteed to go downhill, and ideally we use a function g that is easy to work with (i.e., to minimize or go downhill with respect to). Note that we haven't done any matrix inversions or computed any derivatives of f. Furthermore, the algorithm is numerically stable - it does not over- or undershoot the optimum. The downside is that convergence can be quite slow. The tricky part is finding a good majorizing function. Basically one needs to gain some skill in working with inequalities. The Lange book has some discussion of this.

An example is estimating regression coefficients for median regression (aka least absolute deviation regression), which minimizes f(θ) = Σ_{i=1}^n |y_i − z_i⊤θ| = Σ_{i=1}^n |r_i(θ)|. Note that f(θ) is convex because affine functions (in this case y_i − z_i⊤θ) are convex, convex functions of affine functions are convex, and the summation preserves the convexity. We'll work through this example in class. We'll make use of the following (commonly-used) inequality, which holds for any concave function, f: f(x) ≤ f(y) + f′(y)(x − y).
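To make the idea concrete, here is a minimal sketch (simulated data; mm_lad() is a hypothetical helper written for illustration, not a standard function) of the MM iteration that results when the concave-function inequality is applied with the square root: each majorizer is a weighted sum of squares, so each step is a weighted least squares fit with weights 1/|r_i(θ_t)|, with a small constant guarding against division by zero.

## Minimal sketch of MM for median (LAD) regression via iteratively
## reweighted least squares; data are simulated, mm_lad() is hypothetical.
mm_lad <- function(y, Z, theta = rep(0, ncol(Z)), maxit = 200, tol = 1e-8, eps = 1e-8) {
  for (t in seq_len(maxit)) {
    r <- drop(y - Z %*% theta)
    w <- 1 / pmax(abs(r), eps)                  # weights from the majorizing quadratic
    theta_new <- lm.wfit(Z, y, w)$coefficients  # minimize the majorizer
    if (max(abs(theta_new - theta)) < tol) return(theta_new)
    theta <- theta_new
  }
  theta
}

set.seed(1)
n <- 200
Z <- cbind(1, rnorm(n))
y <- drop(Z %*% c(1, 2)) + rt(n, df = 2)   # heavy-tailed errors
mm_lad(y, Z)   # compare with, e.g., quantreg::rq() if that package is available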

8.2 Expectation-Maximization (EM)

It turns out the EM algorithm that many of you have heard about is a special case of MM. For our purpose here, we'll consider maximization.

The EM algorithm is most readily motivated from a missing data perspective. Suppose you want to maximize L(θ|X = x) = f(x|θ) based on available data in a missing data context. Denote the complete data as Y = (X, Z), with Z missing. As we'll see, in many cases Z is actually a set of latent variables that we introduce into the problem to formulate it so we can use EM. The canonical example is when Z are membership indicators in a mixture modeling context.

In general, L(θ|x) may be hard to optimize because it involves an integral over the missing data, Z:

f(x|θ) = ∫ f(x, z|θ) dz,

but L(θ|y) = f(x, z|θ) may be straightforward to optimize.

The algorithm is as follows. Let θ_t be the current value of θ. Then define

Q(θ|θ_t) = E(log L(θ|Y) | x, θ_t).

The algorithm is:

1. E step: Compute Q(θ|θ_t), ideally calculating the expectation over the missing data in closed form. Note that log L(θ|Y) is a function of θ, so Q(θ|θ_t) will involve both θ and θ_t.

2. M step: Maximize Q(θ|θ_t) with respect to θ, finding θ_{t+1}.

3. Continue until convergence.

Ideally both the E and M steps can be done analytically. When the M step cannot be done analytically, one can employ some of the numerical optimization tools we've already seen. When the E step cannot be done analytically, one standard approach is to estimate the expectation by Monte Carlo, which produces Monte Carlo EM (MCEM). The strategy is to draw z_j from f(z|x, θ_t) and approximate Q as a Monte Carlo average of log f(x, z_j|θ), and then optimize over this approximation to the expectation. If one can't draw in closed form from the conditional density, one strategy is to do a short MCMC to draw a (correlated) sample.

EM can be shown to increase the value of the function at each step using Jensen's inequality (equivalent to the information inequality that holds with regard to the Kullback-Leibler divergence between two distributions) (Givens and Hoeting, p. 95, go through the details). Furthermore, one can show that it amounts, at each step, to maximizing a minorizing function for log L(θ) - the minorizing function (effectively Q) is tangent to log L(θ) at θ_t and lies below log L(θ).

A standard example is a mixture model. Suppose we have

f(x) = Σ_{k=1}^K π_k f_k(x; φ_k)

where we have K mixture components and π_k are the (marginal) probabilities of being in each component. The complete parameter vector is θ = {{π_k}, {φ_k}}. Note that the likelihood is a complicated product (over observations) of sums (over components), so maximization may be difficult. Furthermore, such likelihoods are well-known to be multimodal because of label switching.

To use EM, we take the group membership indicators for each observation as the missing data. For the ith observation, we have z_i ∈ {1, 2, ..., K}. Introducing these indicators “breaks the mixture”. If we know the memberships for all the observations, it's often easy to estimate the parameters for each group based on the observations from that group. For example, if the {f_k} were normal densities, then we can estimate the mean and variance of each normal density using the sample mean and sample variance of the x_i's that belong to each mixture component. EM will give us a variation on this that uses “soft” (i.e., probabilistic) weighting.

The complete log likelihood given z and x is

log Π_i f(x_i|z_i, θ) Pr(Z_i = z_i|π)

which can be expressed as

log L(x, z|θ) = Σ_i Σ_k I(z_i = k)(log f_k(x_i|φ_k) + log π_k)

with Q equal to

Q(θ|θ_t) = Σ_i Σ_k E(I(z_i = k)|x_i, θ_t)(log f_k(x_i|φ_k) + log π_k)

where E(I(z_i = k)|x_i, θ_t) is equal to the probability that the ith observation is in the kth group given x_i and θ_t, which is calculated from Bayes theorem as

p_{ikt} = π_{k,t} f_k(x_i|θ_t) / Σ_j π_{j,t} f_j(x_i|θ_t).

We can now separately maximize Q(θ|θ_t) with respect to the π_k and the φ_k to find π_{k,t+1} and φ_{k,t+1}, since the expression is the sum of a term involving the parameters of the distributions and a term involving the mixture probabilities. For the term involving the distribution parameters, if the f_k are normal densities, you end up with a weighted normal log likelihood, for which the estimators of the mean and variance parameters are the weighted mean of the observations and the weighted variance.
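Here is a minimal sketch of these E and M steps for a two-component univariate normal mixture (the data are simulated; em_mix() is a hypothetical helper written for illustration, not a packaged routine):

## Minimal sketch of EM for a two-component univariate normal mixture.
em_mix <- function(x, maxit = 500, tol = 1e-8) {
  pi1 <- 0.5; mu <- quantile(x, c(0.25, 0.75)); sig <- rep(sd(x), 2)  # crude starting values
  loglik_old <- -Inf
  for (t in seq_len(maxit)) {
    ## E step: membership probabilities p_{i1} given the current parameters
    d1 <- pi1 * dnorm(x, mu[1], sig[1])
    d2 <- (1 - pi1) * dnorm(x, mu[2], sig[2])
    p1 <- d1 / (d1 + d2)
    loglik <- sum(log(d1 + d2))   # observed-data log likelihood at the current parameters
    ## M step: weighted means, weighted variances, and mixture probability
    pi1    <- mean(p1)
    mu[1]  <- sum(p1 * x) / sum(p1)
    mu[2]  <- sum((1 - p1) * x) / sum(1 - p1)
    sig[1] <- sqrt(sum(p1 * (x - mu[1])^2) / sum(p1))
    sig[2] <- sqrt(sum((1 - p1) * (x - mu[2])^2) / sum(1 - p1))
    if (abs(loglik - loglik_old) < tol) break
    loglik_old <- loglik
  }
  list(pi = c(pi1, 1 - pi1), mu = mu, sigma = sig, loglik = loglik)
}

set.seed(1)
x <- c(rnorm(150, 0, 1), rnorm(100, 4, 0.7))
em_mix(x)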

9 Optimization under constraints

Constrained optimization is harder than unconstrained, and inequality constraints harder to deal with than equality constraints. Constrained optimization can sometimes be avoided by reparameterizing. E.g., to optimize w.r.t. a variance component or other non-negative parameter, you can work on the log scale.

Optimization under constraints often goes under the name of 'programming', with different types of programming for different types of objective functions combined with different types of constraints.

9.1 Convex optimization (convex programming)

Convex programming minimizes f(x) s.t. h_i(x) ≤ 0, i = 1,...,m, and a_i⊤x = b_i, i = 1,...,q, where both f and the constraint functions are convex. Note that this includes more general equality constraints, as we can write g(x) = b as two inequalities, g(x) ≤ b and g(x) ≥ b. It also includes h_i(x) ≥ b_i by taking −h_i(x). Also note that we can always have h_i(x) ≤ b_i and convert to the above form by subtracting b_i from each side (note that this preserves convexity). A vector x is said to be feasible, or in the feasible set, if all the constraints are satisfied for x.

There are good algorithms for convex programming, and it's possible to find solutions when we have hundreds or thousands of variables and constraints. It is often difficult to recognize if one has a convex program (i.e., if f and the constraint functions are convex), but there are many tricks to transform a problem into a convex program, and many problems can be solved through convex programming. So the basic challenge is in recognizing or transforming a problem to one of convex optimization; once you've done that, you can rely on existing methods to find the solution.

Linear programming, quadratic programming, second-order cone programming and semidefinite programming are all special cases of convex programming. In general, these types of optimization are progressively more computationally complex. First let's see some of the special cases and then discuss the more general problem.

9.2 Linear programming: Linear system, linear constraints

Linear programming seeks to minimize

f(x)= c⊤x

subject to a system of m inequality constraints, a_i⊤x ≤ b_i for i = 1,...,m, where A is of full row rank. This can also be written in terms of generalized inequality notation, Ax ⪯ b. There are standard algorithms for solving linear programs, including the simplex method and interior point methods. Note that each equation in the set of equations Ax = b defines a hyperplane, so each inequality in Ax ⪯ b defines a half-space. Minimizing a linear function (presuming that the minimum exists) must mean that we push in the correct direction towards the boundaries formed by the hyperplanes, with the solution occurring at a corner (vertex) of the solid formed by the hyperplanes. The simplex algorithm starts with a feasible solution at a corner and moves along edges in directions that improve the objective function.
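As a hedged sketch, assuming the lpSolve add-on package is available (the numbers below are invented), a tiny linear program can be set up and solved with its lp() function:

## Sketch: minimize 2*x1 + 3*x2  subject to  x1 + x2 >= 4,  x1 <= 3,  x1, x2 >= 0
## (lp() imposes non-negativity on the variables by default).
library(lpSolve)
out <- lp(direction = "min",
          objective.in = c(2, 3),
          const.mat = rbind(c(1, 1), c(1, 0)),
          const.dir = c(">=", "<="),
          const.rhs = c(4, 3))
out$solution  # optimal x, here (3, 1)
out$objval    # optimal value of c'x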

9.3 General system, equality constraints

Suppose we have an objective function f(x) and we have equality constraints, Ax = b. We can manipulate this into an unconstrained problem. The null space of A is the set of x s.t. Ax = 0. So if we start with a candidate x_c s.t. Ax_c = b (e.g., obtained using the pseudoinverse, A⁺b), we can form all other candidates (a candidate is an x s.t. Ax = b) as x = x_c + Bz, where B is a set of column basis vectors for the null space of A and z ∈ ℜ^{p−m}. Consider h(z) = f(x_c + Bz) and note that h is a function of p − m rather than p inputs. Namely, we are working in a reduced-dimension space with no constraints. If we assume differentiability of f, we can express ∇h(z) = B⊤∇f(x_c + Bz) and H_h(z) = B⊤H_f(x_c + Bz)B. Then we can use unconstrained methods to find the point at which ∇h(z) = 0.

How do we find B? One option is to use the p − m columns of V in the SVD of A that correspond to singular values that are zero. A second option is to take the QR decomposition of A⊤. Then B is the columns of Q_2, where these are the columns of the (non-skinny) Q matrix corresponding to the rows of R that are zero.
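Here is a minimal sketch (with an invented objective and invented constraints) of this reduced-dimension approach, taking x_c as the pseudoinverse solution and B from the QR decomposition of A⊤:

## Sketch: minimize f(x) subject to Ax = b by optimizing over the null space of A.
f <- function(x) sum((x - c(1, 2, 3))^2) + 0.5 * x[1] * x[2]
A <- matrix(c(1, 1,  1,
              1, 0, -1), nrow = 2, byrow = TRUE)
b <- c(1, 0)

xc <- drop(t(A) %*% solve(A %*% t(A), b))   # pseudoinverse solution, A xc = b
QR <- qr(t(A))
B  <- qr.Q(QR, complete = TRUE)[, -seq_len(QR$rank), drop = FALSE]  # null-space basis

h   <- function(z) f(drop(xc + B %*% z))    # unconstrained problem in p - m dimensions
opt <- optim(rep(0, ncol(B)), h, method = "BFGS")
xstar <- drop(xc + B %*% opt$par)
xstar
A %*% xstar   # equals b up to rounding error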

For more general (nonlinear) equality constraints, g_i(x) = b_i, i = 1,...,q, we can use the Lagrange multiplier approach to define a new objective function,

L(x,λ) = f(x) + λ⊤(g(x) − b)

for which, if we set the derivative (with respect to both x and the Lagrange multiplier vector, λ) equal to zero, we have a critical point of the original function and we respect the constraints.

An example occurs with quadratic programming, under the simplification of affine equality constraints (quadratic programming in general optimizes a quadratic function under affine inequality constraints, i.e., constraints of the form Ax − b ⪯ 0). For example, we might solve a least squares problem subject to linear equality constraints, f(x) = (1/2) x⊤Qx + m⊤x + c s.t. Ax = b, where Q is positive semi-definite. The Lagrange multiplier approach gives the objective function

L(x,λ) = (1/2) x⊤Qx + m⊤x + c + λ⊤(Ax − b)

and differentiating gives the equations

∂L(x,λ)/∂x = m + Qx + A⊤λ = 0
∂L(x,λ)/∂λ = Ax − b = 0,

which leads to the solution

(x, λ)⊤ = [Q, A⊤; A, 0]^{−1} (−m, b)⊤    (1)

which gives us x* = −Q^{−1}m + Q^{−1}A⊤(AQ^{−1}A⊤)^{−1}(AQ^{−1}m + b).

Under inequality constraints there are a variety of methods, but we won't go into them.
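As a small check on this derivation (Q, m, A, and b below are invented), the sketch solves the block system in (1) directly and compares it with the closed-form expression for x*:

## Sketch: equality-constrained quadratic program
##   minimize (1/2) x'Qx + m'x + c  subject to  Ax = b.
Q <- diag(c(2, 3, 1))             # positive definite
m <- c(-1, 0, 2)
A <- matrix(c(1, 1, 1), nrow = 1)
b <- 1
p <- ncol(Q); q <- nrow(A)

KKT <- rbind(cbind(Q, t(A)),
             cbind(A, matrix(0, q, q)))   # block matrix from equation (1)
sol    <- solve(KKT, c(-m, b))
xstar  <- sol[1:p]
lambda <- sol[-(1:p)]

## closed-form check: x* = -Q^{-1}m + Q^{-1}A'(AQ^{-1}A')^{-1}(AQ^{-1}m + b)
Qi <- solve(Q)
xstar2 <- drop(-Qi %*% m + Qi %*% t(A) %*% solve(A %*% Qi %*% t(A), A %*% Qi %*% m + b))
all.equal(xstar, xstar2)   # TRUE
A %*% xstar                # equals b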

9.4 The dual problem

Sometimes a reformulation of the problem eases the optimization. There are different kinds of dual problems, but we'll just deal with the Lagrangian dual. Let f(x) be the function we want to minimize, under constraints g_i(x) = 0, i = 1,...,q, and h_j(x) ≤ 0, j = 1,...,m. Here I've explicitly written out the equality constraints to follow the notation in Lange. Consider the Lagrangian,

L(x,λ,µ) = f(x) + Σ_i λ_i g_i(x) + Σ_j µ_j h_j(x).

Its Lagrange dual function is d(λ,µ) = inf_x L(x,λ,µ). For µ ⪰ 0, one can easily show that d(λ,µ) ≤ f(x*) for the minimizing value x* (p. 216 of the Boyd book), so the challenge is then to find the best lower bound.

Thus, the Lagrange dual problem is to maximize d(λ,µ) subject to µ ⪰ 0, which is always a convex optimization problem because d(λ,µ) is concave (because d(λ,µ) is a pointwise infimum of a family of affine functions of (λ,µ)). If the optima of the primal (original) problem and that of the dual do not coincide, there is said to be a “duality gap”. For convex programming, if certain conditions are satisfied (called constraint qualifications), then there is no duality gap, and one can solve the dual problem to solve the primal problem. Usually with the standard form of convex programming, there is no duality gap. Provided we can find the infimum over x in closed form, we then maximize d(λ,µ) w.r.t. the Lagrange multipliers in a new constrained problem that is sometimes easier to solve, giving us (λ*,µ*).

One can show (p. 242 of the Boyd book) that µ_i* = 0 unless the ith constraint is active at the optimum x*, and that x* minimizes L(x,λ*,µ*). So once one has (λ*,µ*), one is in the position of minimizing an unconstrained convex function. If L(x,λ*,µ*) is strictly convex, then x* is the unique optimum provided x* satisfies the constraints, and no optimum exists if it does not.

Here's a simple example: suppose we want to minimize x⊤x s.t. Ax = b. The Lagrangian is L(x,λ) = x⊤x + λ⊤(Ax − b). Since L(x,λ) is quadratic in x, the infimum is found by setting ∇_x L(x,λ) = 2x + A⊤λ = 0, yielding x = −(1/2)A⊤λ. So the dual function is obtained by plugging this value of x into L(x,λ), which gives

g(λ) = −(1/4) λ⊤AA⊤λ − b⊤λ,

which is concave quadratic. In this case we can solve the original constrained problem in terms of this unconstrained dual problem.
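Here is a minimal sketch (with an invented A and b) of solving this example through the dual: maximize g(λ) in closed form and recover x* = −(1/2)A⊤λ*:

## Sketch: minimize x'x subject to Ax = b via the Lagrangian dual.
A <- matrix(c(1, 1,  1,
              1, 0, -1), nrow = 2, byrow = TRUE)
b <- c(1, 0)

## g(lambda) is maximized where -(1/2) A A' lambda - b = 0, i.e., lambda* = -2 (A A')^{-1} b
lambda_star <- drop(-2 * solve(A %*% t(A), b))
x_star <- drop(-0.5 * t(A) %*% lambda_star)   # x* = -(1/2) A' lambda*
x_star                                        # the minimum-norm solution A'(AA')^{-1} b
A %*% x_star                                  # equals b

## numerical check: maximize g directly (fnscale = -1 makes optim() maximize)
g <- function(l) -0.25 * drop(t(l) %*% A %*% t(A) %*% l) - sum(b * l)
optim(c(0, 0), g, method = "BFGS", control = list(fnscale = -1))$par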

9.5 KKT conditions

Karush-Kuhn-Tucker (KKT) theory provides sufficient conditions under which a constrained optimization problem has a minimum, generalizing the Lagrange multiplier approach. The Lange and Boyd books have whole sections on this topic. Suppose that the objective function and the constraint functions are continuously differentiable near x* and that we have the Lagrangian as before:

L(x,λ,µ) = f(x) + Σ_i λ_i g_i(x) + Σ_j µ_j h_j(x).

For nonconvex problems, if x* and (λ*,µ*) are the primal and dual optimal points and there is no duality gap, then the KKT conditions hold:

h_j(x*) ≤ 0
g_i(x*) = 0
µ_j* ≥ 0
µ_j* h_j(x*) = 0
∇f(x*) + Σ_i λ_i ∇g_i(x*) + Σ_j µ_j ∇h_j(x*) = 0.

For convex problems, we also have that if the KKT conditions hold, then x* and (λ*,µ*) are primal and dual optimal and there is no duality gap.

We can consider this from a slightly different perspective, in this case requiring that the Lagrangian be twice differentiable.

First we need a definition. A tangent direction, w, with respect to g(x), is a vector for which ∇g_i(x)⊤w = 0. If we are at a point, x*, at which the constraint is satisfied, g_i(x*) = 0, then we can move in the tangent direction (orthogonal to the gradient of the constraint function, i.e., along the level curve) and still satisfy the constraint. This is the only kind of movement that is legitimate (gives us a feasible solution). If the gradient of the Lagrangian with respect to x is equal to 0,

∇f(x*) + Σ_i λ_i ∇g_i(x*) + Σ_j µ_j ∇h_j(x*) = 0,

and if w⊤H_L(x*,λ,µ)w > 0 for all vectors w s.t. ∇g(x*)⊤w = 0 and, for all active constraints, ∇h(x*)⊤w = 0, then x* is a local minimum. An active constraint is an inequality for which h_j(x*) = 0 (rather than h_j(x*) < 0, in which case it is inactive). Basically we only need to worry about the inequality constraints when we are on the boundary, so the goal is to keep the constraints inactive.

Some basic intuition is that we need positive definiteness only for directions that stay in the feasible region. That is, our only possible directions of movement (the tangent directions) keep us in the feasible region, and for these directions, we need the objective function to be increasing to have a minimum. If we were to move in a direction that goes outside the feasible region, it's ok for the quadratic form involving the Hessian to be negative.

Many algorithms for convex optimization can be interpreted as methods for solving the KKT conditions.

9.6 Interior-point methods

We'll briefly discuss one of the standard methods for solving a convex optimization problem. The barrier method is one type of interior-point algorithm. It turns out that Newton's method can be used to solve a constrained optimization problem with twice-differentiable f and linear equality constraints. So the basic strategy of the barrier method is to turn the more complicated constrained problem into one with only linear equality constraints.

Recall our previous notation, in which convex programming minimizes f(x) s.t. h_i(x) ≤ 0, i = 1,...,m, and a_i⊤x = b_i, i = 1,...,q, where both f and the constraint functions are convex. The strategy begins with moving the inequality constraints into the objective function:

f(x) + Σ_{i=1}^m I_−(h_i(x)),

where I_−(u) = 0 if u ≤ 0 and I_−(u) = ∞ if u > 0. This is fine, but the new objective function is not differentiable, so we can't use a Newton-like approach.

Instead, we approximate the indicator function with a logarithmic function, giving the new objective function

f̃(x) = f(x) + Σ_{i=1}^m −(1/t) log(−h_i(x)),

which is convex and differentiable. The new term pushes up the value of the overall objective function as x approaches the boundary, nearing points for which the inequality constraints are not met. The Σ_i −(1/t) log(−h_i(x)) term is called the log barrier, since it keeps the solution in the feasible set (i.e., the set where the inequality constraints are satisfied), provided we start at a point in the feasible set. Newton's method with equality constraints (Ax = b) is then applied. The key thing is then to have t get larger as the iterations proceed, which allows the solution to get closer to the boundary if that is indeed where the minimum lies.

The basic ideas behind Newton's method with equality constraints are (1) start at a feasible point, x_0, such that Ax_0 = b, and (2) make sure that each step is in a feasible direction, A(x_{t+1} − x_t) = 0. To make sure the step is in a feasible direction, we have to solve a linear system similar to that in the simplified quadratic programming problem (1):

(x_{t+1} − x_t, λ)⊤ = [H_f̃(x_t), A⊤; A, 0]^{−1} (−∇f̃(x_t), 0)⊤,

which shouldn't be surprising, since the whole idea of Newton's method is to substitute a quadratic approximation for the actual objective function.
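As a minimal sketch of the barrier idea only (an invented problem with inequality constraints and no equality constraints, so plain BFGS stands in for the equality-constrained Newton step described above):

## Sketch of the log barrier: minimize sum(x^2) subject to x_j >= 1,
## i.e., h_j(x) = 1 - x_j <= 0.
f <- function(x) sum(x^2)
barrier_obj <- function(x, t) {
  h <- 1 - x                           # constraint functions; must stay negative
  if (any(h >= 0)) return(Inf)         # infeasible trial points are rejected
  f(x) - (1 / t) * sum(log(-h))        # f plus the log barrier
}
barrier_grad <- function(x, t) 2 * x - (1 / t) / (x - 1)

x <- c(3, 5)                           # feasible starting point
for (t in c(1, 10, 100, 1000, 1e4)) {  # increase t so the barrier matters less and less
  x <- optim(x, barrier_obj, gr = barrier_grad, t = t, method = "BFGS")$par
}
x                                      # approaches the constrained minimum (1, 1)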

10 Summary

The different methods of optimization have different advantages and disadvantages. According to Lange, MM and EM are numerically stable and computationally simple but can converge very slowly. Newton’s method shows very fast convergence but has the downsides we’ve discussed. Quasi-Newton methods fall in between. Convex optimization generally comes up when optimizing under constraints. One caution about optimizing under constraints is that you just get a point estimate; quantifying uncertainty in your estimator is more difficult. One strategy is to ignore the inactive inequality constraints and reparameterize (based on the active equality constraints) to get an unconstrained problem in a lower-dimensional space.
