Michael Wrzaczek Dept of Biosciences, Plant Biology Viikki Plant Science Centre (ViPS) University of Helsinki, Finland
November 11th, 2015 Why learn and use Unix http://stein.cshl.org/genome_informatics/unix1/why_unix.html
Bioinformatics tools built on Unix
Why learn and use Unix
simple I/O system
good programming tools
building block approach to programming
many text manipulation programmes
easy networking (integrated networking)
are lazy
Why learn and use Unix
Unix is very powerful
Many tools run on Unix
Majority of web services run on Unix
What is Unix...
UNIX operating system: 1970s, Bell Laboratories
1969: UNIX in Assembler code (Ken Thompson)
1972-1974: re-implementation in C (Ken Thompson and Dennis Ritchie)
Until 1979: UNIX source code free, and freely distributed
Early 80
The Unix-Philosophy
Write programmes that do one thing and do that one thing well.
The Unix-Philosophy
Write programmes so that they can work together.
The Unix-Philosophy
Write programmes so that they use text files, since that is a universal interface. (Douglas McIlroy)
The Unix-Philosophy
Due to the history of UNIX, ease of use (user-friendliness) was not one of the prime goals. UNIX was an operating system to develop
a system for geeks.
The Unix-Philosophy
According to Eric S. Raymond Unix-Philosophies can be summed up as
The Unix-Philosophy
Mike Gancarz, 9 paramount precepts:
1. Small is beautiful
2. Make each programme do one thing well
3. Build a prototype as soon as possible
4. Choose portability over efficiency
5. Store data in flat text files
6. Use software leverage to your advantage
7. Use shell scripts to increase leverage and portability
8. Avoid captive user interfaces
9. Make every programme a filter
The Unix-Philosophy
10 lesser tenets (not universally agreed upon):
1. Allow the user to tailor the environment
2. Make operating system kernels small and lightweight
3. Use lowercase and keep it short
4. Save trees
5. Silence is golden
6. Think parallel
7. The sum of the parts is greater than the whole
8. Look for the 90 percent solution
9. Worse is better (simplicity over perfection!)
10. Think hierarchically
Unix makes it easy and efficient to
customize
create from scratch
connect (workflow automation)
Since many flavours of Unix are open-source, it is very easy to take existing code/programmes and modify them according to the specific requirements.
(POSIX-compliant) AIX, IRIX, Solaris, Mac OS X
(not certified as POSIX-compliant) GNU, Minix, BSD, Linux
POSIX standard: standardized application-level interface for Unix.
While AIX, IRIX, Solaris and Mac OS X are POSIX-compliant, Linux and the BSDs are not POSIX-certified but generally comply with the standard.
Put simply, by sticking to the POSIX specifications it is possible to write a new operating system that is compatible with other Unix systems.
So, what is Linux?
kernel of the GNU/Linux operating system
operating system
-> distribution
-compatible free operating system. -user ready, Linux-kernel is used.
The kernel is the central component of most computer operating systems. Amongst other things it regulates the communication between software and hardware and also between processes.
kernel of the GNU/Linux operating system
software required (to interact with the computer.)
Linux distributions
kernel
+ additional software
+ package manager for simple software installation and removal
Note that many devices run Linux without you being aware of it, eg. routers, PDAs, etc.
freely available on the net Pay for company-support Information: discussion forums newsgroups
The community is generally very helpful in solving
Some GNU/Linux Distributions:
Ubuntu (http://www.ubuntulinux.org)
Debian GNU/Linux (http://www.debian.org)
openSUSE (Novell SUSE Linux Desktop; http://www.opensuse.org)
Fedora (Red Hat; http://www.fedoraproject.org)
Slackware Linux (http://www.slackware.org)
Gentoo GNU/Linux (http://www.gentoo.org)
Mandriva Linux (formerly Mandrake; http://www.mandriva.com)
Which version of Linux is right for me?
Depends on use and taste.
Arch, Debian, Gentoo, Slackware, etc
Suse, Redhat
Communication with Unix
You talk directly to the operating system
You talk to a programme that communicates with Unix, or that communicates with another programme that communicates with Unix
What kinds of programmes can you talk to?
a shell
an interactive command
a Graphical User Interface (GUI)
tty . That comes from the teletypes that were originally used to interact with the operating system
Note: Technically any software that allows the user to interact with a computer is a shell. A GUI is a shell just like a terminal. However, usually the command-line interpreter (or terminal) is referred to as shell.
Communication with Unix
[Figure: the user passes commands and data to the shell and receives output and a prompt; the shell runs its built-in commands and programmes such as cat, ls, vi and adb, which hand requests for services down to the kernel and device drivers.]
The Unix Shell
PROGRAMMING LANGUAGE!
Shell scripts are not compiled (=translated into binaries, the machine understandable format) but interpreted at runtime INTERPRETED LANGUAGE
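A minimal sketch of what "interpreted" means in practice: the lines of a shell script are read and executed by the shell when the script runs, with no compilation step in between. The file name hello.sh and its contents are made up for illustration.

```shell
# Write a tiny script to a scratch directory (hello.sh is a hypothetical name).
script="$(mktemp -d)/hello.sh"
cat > "$script" <<'EOF'
#!/bin/sh
# The shell reads and executes these lines one by one at runtime.
name="world"
echo "Hello, $name"
EOF

# No compiler is involved: the text file is handed straight to the interpreter.
greeting=$(sh "$script")
echo "$greeting"
```

Editing hello.sh and running it again takes effect immediately, which is what makes shell scripts convenient for quick experiments.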
Shell scripts are very powerful, but more sophisticated alternatives exist. We will see Perl and Python later in the course.
The Unix shell
Different types
Thompson Shell (osh; the original UNIX shell), Bourne Shell (sh), C-Shell (csh)
Korn Shell (ksh), Bourne Again Shell (bash), Z Shell (zsh)
-in commands (eg. cd)
Which shell are you running?
michael@infocalypse ~ $ grep wrzaczek /etc/passwd
michael:x:1007:100::/home/michael:/bin/bash
michael@infocalypse ~ $ echo $SHELL
/bin/bash
michael@infocalypse ~ $
The Unix shell few helpful things
CASE SENSITIVE! Trying to open a file or using a command will not work if you mix up upper- and lowercase letters. Commonly, commands use lowercase letters, while options can be upper- or lowercase.
The Unix shell few helpful things
Tab completion typing the first few letters and then hitting the TAB key will complete the command/filename. If there are still several possibilities, when you press TAB a second time you will be presented with a list of appropriate commands/filenames.
Type co and press TAB TAB. This will give you all commands starting with co.
Pathnames locate files and directories in a Unix filesystem
File extensions usually do not identify a file (there are a few exceptions!!!)
Access to files is based on the concept of users and groups
Every user has a unique account with a unique ID number (UID) Users are members of one or more groups
Users and groups can efficiently be used to restrict or allow access to files depending on requirements
wrzaczek@kasbi31 ~ $ cd /
wrzaczek@kasbi31 / $ ls -l
total 72
drwxr-xr-x 2 root root 4096 2008-01-29 10:10 bin
drwxr-xr-x 2 root root 29 2005-06-09 14:43 boot
drwxr-xr-x 14 root root 13180 2008-01-31 16:09 dev
drwxr-xr-x 84 root root 8192 2008-02-03 19:08 etc
drwxr-xr-x 26 root root 4096 2007-11-09 13:20 home
drwxr-xr-x 9 root root 8192 2008-01-29 10:10 lib
drwxr-xr-x 2 root root 33 2008-01-29 12:24 media
drwxr-xr-x 9 root root 106 2006-03-13 19:21 mnt
drwxr-xr-x 13 root root 4096 2007-08-10 17:44 opt
dr-xr-xr-x 93 root root 0 2008-01-31 16:08 proc
drwx------ 17 root root 4096 2008-02-03 19:07 root
drwxr-xr-x 2 root root 8192 2008-01-29 12:24 sbin
drwxr-xr-x 11 root root 0 2008-01-31 16:08 sys
drwxrwxrwt 6 root root 4096 2008-02-03 20:20 tmp
drwxr-xr-x 14 root root 4096 2007-11-02 01:01 usr
drwxr-xr-x 14 root root 4096 2006-09-15 20:40 var
wrzaczek@kasbi31 / $
Inside a Unix system...
Enter the command df -h
Unix system harddisks
[Figure: harddisks are split into partitions; after mounting, a directory points to the assigned harddisk.]
Mounting is the attaching of an additional filesystem to the currently accessible filesystem of a computer.
You can tell your computer to add a new filesystem (a partition on a harddisk, a CD, a DVD, a USB stick etc) to the system and point to it from a folder on the already existing filesystem.
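To see mount points on a running system, df can report which filesystem holds a given path. The following is a small sketch; the helper name mount_point_of is made up for this example, and it assumes mount points without spaces in their names.

```shell
# Print the mount point of the filesystem that holds a given path.
# df -P prints one header line and one data line per filesystem;
# the last field of the data line is the mount point.
mount_point_of() {
    df -P "$1" | awk 'NR == 2 { print $NF }'
}

mount_point_of /        # the root filesystem is mounted at /
mount_point_of /tmp     # may be / or a separate filesystem, depending on the system
```

If /tmp lives on its own partition, the second call prints a different mount point than the first one.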
The command cd (change directory) moves you around the tree.
cd without any options takes you to your home directory.
cd / takes you to the root level (= the most basal part of the filesystem)
NOTE: root can mean different things
root: the superuser account of the system
root: the home directory of the root user
root: the basal level of the filesystem, indicated by /
When you log in to your Linux machine you find yourself in your HOME directory.
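The cd behaviour described above can be tried out with a few harmless commands. This is only a sketch; the directory /tmp is used as an example and is assumed to exist.

```shell
cd /                 # go to the root level of the filesystem
root_dir=$(pwd)      # pwd prints the current directory
echo "$root_dir"

cd                   # no arguments: back to your home directory
echo "now in: $(pwd)"

cd /tmp              # an absolute path starts at /
cd ..                # .. is the parent directory, here the root level
parent_dir=$(pwd)
echo "$parent_dir"
```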
Type cd / to go to the lowest level of the filesystem.
Unix: single-root hierarchical file system.
michael@infocalypse ~ $ cd /
michael@infocalypse / $ ls -l
total 72
drwxr-xr-x 2 root root 4096 2008-01-29 10:10 bin
drwxr-xr-x 2 root root 29 2005-06-09 14:43 boot
drwxr-xr-x 14 root root 13180 2008-01-31 16:09 dev
drwxr-xr-x 84 root root 8192 2008-02-03 19:08 etc
drwxr-xr-x 26 root root 4096 2007-11-09 13:20 home
drwxr-xr-x 9 root root 8192 2008-01-29 10:10 lib
drwxr-xr-x 2 root root 33 2008-01-29 12:24 media
drwxr-xr-x 9 root root 106 2006-03-13 19:21 mnt
drwxr-xr-x 13 root root 4096 2007-08-10 17:44 opt
dr-xr-xr-x 93 root root 0 2008-01-31 16:08 proc
drwx------ 17 root root 4096 2008-02-03 19:07 root
drwxr-xr-x 2 root root 8192 2008-01-29 12:24 sbin
drwxr-xr-x 11 root root 0 2008-01-31 16:08 sys
drwxrwxrwt 6 root root 4096 2008-02-03 20:20 tmp
drwxr-xr-x 14 root root 4096 2007-11-02 01:01 usr
drwxr-xr-x 14 root root 4096 2006-09-15 20:40 var
michael@infocalypse / $
Devices (harddisks, cds, usb to directory locations (mountpoint)
Some basic information
who, whoami and pwd give information on who is using the system, who you currently are, and which directory you are currently in.
ls, df and du give information on files, directories and filesystems/disks.
ls lists the contents of a directory
df displays free/used diskspace
du estimates the disk usage of files
Try out the commands and check the man pages for useful options (eg. -l, -h, -s, --max-depth).
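A short sketch of the three commands with some of the options mentioned above, run on a scratch directory so that nothing important is touched. The file name seq.txt and its content are made up for the example.

```shell
workdir=$(mktemp -d)                     # scratch directory
printf 'ACGT\n' > "$workdir/seq.txt"     # hypothetical example file

ls -l "$workdir"     # long listing: permissions, owner, size, date
df -h "$workdir"     # free/used space of the filesystem, human-readable
du -s "$workdir"     # disk usage of the directory, summarized in one line

listing=$(ls "$workdir")
echo "$listing"
```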
The Unix shell few helpful things
Command history: using the arrow-up key you can re-use the commands that you used before in your shell.
Press the arrow-up key to see the previously used commands or use
history 10
to see eg. the last 10 commands.
Your best friend: the man pages
To obtain information about the precise use of a command and the available options, use man followed by the name of the command.
Most of the Unix commands have a well-written man page describing their function and usage.
man is your friend!
Some common Unix commands
The following is a list of commonly used LINUX/UNIX commands which may be of value during your Telnet sessions. Remember that LINUX/UNIX is case sensitive. Options or flags which can be used with a command are placed in [ ]. The [ ] are not part of the command and should not be included in the command that you type.
passwd: changes your password
logout: logs you out of a Telnet session
cd: change directory; cd .. moves you up one directory level; cd / moves you to the highest directory level
chmod permissions filenames: changes the permissions for a file; permissions should include a letter designating who gets permissions (u for the user, g for the group, o for others, or a for all) followed by a + or - (to give or take away the permission) followed by the kind of permission (r for read access, w for write access, x for execute if the file is a program or script); the complete command that you type should look like: chmod g-w filename
chown user:group filenames: changes ownership of a file
clear: clears the screen
cp oldfiles newfiles: copies a file; this leaves the old file intact and makes a new copy with a new filename
date: tells you the current date and time
df: displays how much space on the disks (harddrive partitions) is free
du [-a] [-s] directories: tells you how much disk space your files occupy; the -a option displays the space used by each file, not just each directory; the -s option displays the total space used for each directory but not subdirectory
help: provides online help; several topics have been included in the help system available on the servers
ls [-l] [-a] [-p] [-r] [-t] [-x]: lists the files in a directory; -l displays detailed information about each file and directory, including permissions, owners, size and time/date when the file was last modified; -a displays all the files and subdirectories including hidden files (with names that begin with a dot); -p displays a slash at the end of each directory name to distinguish them from filenames; -r displays files in reverse order; -t displays files in order of modification time; -x displays the filenames in columns across the screen
man [-k keywords] topic: displays the reference manual page about a LINUX command; the -k keywords option allows you to see all man pages that contain that keyword; topic is the command or topic which you want information about
mesg [n|y]: lets you control whether other people can use the talk command to interrupt you with on-screen messaging; mesg n will block the interruptions; mesg y will allow interruptions
mkdir new_directory: makes a new subdirectory with the name specified by new_directory
mv [-i] oldname newname: renames a file or moves it from one filename or directory to another; the -i option tells mv to prompt you before it replaces an existing filename
ping IP address or server alias: sends a ping packet to another server; this provides information concerning the time it takes for information to make the round trip to the other computer; it will also tell you whether the other server is on-line at that time
ps: displays information about your processes/jobs/programs which are running on the server
pwd: shows the directory you are currently in
ssh username@servername: allows you to log in to a remote ssh server via an encrypted connection (nowadays almost always replaces telnet)
rm [-i] [-r] filenames: removes or deletes files; the -i option asks you to confirm that you want to delete each file; the -r option is dangerous because it allows you to delete an entire directory and all of the files it contains
rmdir directory: removes an empty directory; to remove a directory together with its contents, use rm -r
tail [-r] [-lines] filename: displays the last few lines of a file; -r displays the lines in reverse order; -lines specifies the number of lines, starting at the end of the file, you want to see
touch [-a] [-c] [-m] [date] filenames: changes the date and time for a file without changing the content of the file; -a changes only the date and time the file was last accessed; -c doesn't create a file if it does not already exist; -m changes only the date and time the file was last modified; date specifies the date and time to give the file in the mmddhhnn format (month, day, hour, minute); touch with a new filename will create a new, empty file
traceroute IP address or server alias: provides information concerning the route which packets must take to get from your computer (the server in this case) to a remote computer/server; typically used to diagnose possible problems in packet routing
vim: Vim is a text editor. Further information concerning the editing commands for Vim can be found in the help document.
w: provides information concerning who is logged into the system and some details on how they are connected
who: tells you who is using the server at that time
write username: sends a message to another person using the system; to prevent someone from writing to you, see the mesg n command
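The chmod notation from the list above (who, then + or -, then the kind of permission) can be checked on a scratch file. A minimal sketch; the starting permissions 664 are chosen only to make the change visible.

```shell
f=$(mktemp)                          # scratch file
chmod 664 "$f"                       # start from rw-rw-r--
before=$(ls -l "$f" | cut -c1-10)    # first 10 characters: type and permissions
chmod g-w "$f"                       # take write permission away from the group
after=$(ls -l "$f" | cut -c1-10)
echo "$before -> $after"
rm -f "$f"
```

The permission string is read in groups of three: user, group, others.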
Becoming root
The command su (substitute user) allows you to change your user identity (not only to root, but to any other user). su - allows you to become root (the dash takes you to root's home directory and login environment;
su alone would make you root but leave you in the current directory).
The passwd command allows you to change your password.
Root can use passwd
Using useradd, root can add a new user to the system. A common use would be eg. useradd
What do the options -m, -s and -G do? Check the man page.
Check the man pages for the commands users, groups and groupadd to see what those commands do.
Manipulating files
cp copies a file/directory from location A to location B. Remember to use the recursive option (-r) to copy directories.
mv renames or moves a file, while ln creates a link.
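The three commands can be tried out in a scratch directory. This is a sketch under assumed names: data, a.txt, b.txt, c.txt and link.txt are all made up for the example.

```shell
work=$(mktemp -d)          # scratch directory, nothing important lives here
cd "$work"
mkdir data
echo "sample" > data/a.txt

cp data/a.txt b.txt        # copy a single file
cp -r data data_copy       # directories need the recursive option
mv b.txt c.txt             # rename: b.txt no longer exists afterwards
ln -s c.txt link.txt       # symbolic link pointing to c.txt

ls -l
link_content=$(cat link.txt)   # reading the link reads c.txt
```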
What does the output of ls -l tell you?
michael@infocalypse ~ $ cd /
michael@infocalypse / $ ls -l
total 72
drwxr-xr-x 2 root root 4096 2008-01-29 10:10 bin
drwxr-xr-x 2 root root 29 2005-06-09 14:43 boot
drwxr-xr-x 14 root root 13180 2008-01-31 16:09 dev
drwxr-xr-x 84 root root 8192 2008-02-03 19:08 etc
drwxr-xr-x 26 root root 4096 2007-11-09 13:20 home
drwxr-xr-x 9 root root 8192 2008-01-29 10:10 lib
drwxr-xr-x 2 root root 33 2008-01-29 12:24 media
drwxr-xr-x 9 root root 106 2006-03-13 19:21 mnt
drwxr-xr-x 13 root root 4096 2007-08-10 17:44 opt
dr-xr-xr-x 93 root root 0 2008-01-31 16:08 proc
drwx------ 17 root root 4096 2008-02-03 19:07 root
drwxr-xr-x 2 root root 8192 2008-01-29 12:24 sbin
drwxr-xr-x 11 root root 0 2008-01-31 16:08 sys
drwxrwxrwt 6 root root 4096 2008-02-03 20:20 tmp
drwxr-xr-x 14 root root 4096 2007-11-02 01:01 usr
drwxr-xr-x 14 root root 4096 2006-09-15 20:40 var
michael@infocalypse / $
The command chown transfers ownership of a file/directory while chmod changes the permissions.
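rm removes files. Directories require the -r option. rmdir also removes directories, but only empty ones.
no trashcan
Be careful with the -r option
Option -i asks you to confirm each deletion
Option -f (force) never asks for confirmation and ignores files that do not exist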
Make sure you know what you delete.
What would happen if you do the following?
cd /
rm -rf *
THAT!!!!!
About your system...
uptime shows how long the machine has been running,
uname -a shows the version of the running kernel.
ps can be used to view the running processes
ps -A shows all processes; ps -A -o user,pid,comm also shows who they belong to
top displays running processes dynamically, ordered by consumed resources
About your system...
kill sends a signal to a process, by default TERM, which asks the process to terminate. Other signals can be sent as well (check the man page).
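A harmless way to try kill is to start a throwaway process in the background and stop it again. A minimal sketch; sleep 30 stands in for any long-running programme.

```shell
sleep 30 &                 # start a process in the background
pid=$!                     # $! holds the process ID of the last background job
kill -TERM "$pid"          # send the TERM signal (also the default of kill)
wait "$pid"                # collect the exit status of the stopped process
status=$?
echo "exit status: $status"
```

The shell reports 128 plus the signal number for a process that was stopped by a signal; TERM is signal 15, so the status here is 143.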
Useful commands
Useful for everyday work: sort, cmp, diff and comm
sort is obvious, check the man pages for the options and possibilities.
diff finds differences between files
cmp compares two files byte for byte
comm compares two sorted files line for line
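The four commands work well together. The following sketch compares two small lists; the file names and the gene names in them are invented for the example.

```shell
work=$(mktemp -d)
printf 'gene3\ngene1\ngene2\n' > "$work/a.txt"
printf 'gene2\ngene4\ngene1\n' > "$work/b.txt"

# comm expects sorted input, so sort both files first.
sort "$work/a.txt" > "$work/a.sorted"
sort "$work/b.txt" > "$work/b.sorted"

# -12 suppresses columns 1 and 2, leaving only lines present in both files.
common=$(comm -12 "$work/a.sorted" "$work/b.sorted")
echo "$common"

# diff exits with status 1 when the files differ.
diff "$work/a.sorted" "$work/b.sorted" || echo "files differ"

# cmp -s prints nothing and only sets the exit status.
cmp -s "$work/a.sorted" "$work/a.sorted" && echo "identical"
```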
Unix commands can be put into scripts. But they can also be combined directly on the command line.
A pipe (|) passes the output of command 1 to the following command. Example: ls -l / | grep bin
With &&, the following command(s) will only be executed if the previous command was successfully executed.
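Both ways of combining commands can be sketched in a few lines. The variable names first and second are only there to make the effect of && visible.

```shell
# Pipe: the output of ls becomes the input of grep.
ls -l / | grep bin

# &&: the second command runs only if the first one succeeded.
true  && first="ran"       # true always succeeds, so the assignment happens
false && second="ran"      # false always fails, so the assignment is skipped
echo "first=${first:-skipped} second=${second:-skipped}"
```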
Long output & how to save it
Sometimes commands will result in several pages of output, making it very impractical or in the worst case impossible to review everything.
One way to solve this is to pipe the output through the pager less, eg. ls -l | less
Alternatively you can redirect the output of your commands to a file instead of the screen. ls -l >>
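Redirection is easy to try out on a scratch file. A minimal sketch; the file created by mktemp stands in for a real output file.

```shell
out=$(mktemp)              # scratch file
echo "first"  >  "$out"    # > creates the file or replaces its contents
echo "second" >> "$out"    # >> adds to the end of the file
appended=$(cat "$out")     # now two lines: first, second
echo "third"  >  "$out"    # > again: the previous two lines are gone
overwritten=$(cat "$out")
rm -f "$out"
```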
>> will append the output to the contents of a file (a non-existing file will be created) while > will overwrite the file!!!
What you can and cannot do...
In a Unix system, the default user is not allowed to do many things...
As a user you are allowed to run specific programmes and manipulate all files that belong to the user.
Only the root user is allowed to do everything.
To allow average users to do certain things, Unix has groups. Eg. In order to allow a user to use the CD-drive, the user has to be a member of the group cdrom
File permissions are another important tool. File permissions can individually be set for the user, a certain group or everyone. More on that a little later.
root
On Unix operating systems the first user account is called root.
root is the only user on a Unix machine with completely NON-restricted rights. This is needed for installation initially and is usually used later on for administration.
Since root can do anything, you do not work as root on a Unix machine unless it is really required. Use the programmes su or sudo to temporarily become root!
If an attacker finds out your root password, the machine is completely open to them.
On BSD systems (and a few others) an alternative root account named toor exists. toor, however, only has a minimal shell.
grep
ls -l / | grep bin
Check the man page for grep and find out, what grep does.
What happens if you try the following?
ls -l / > ls.txt
grep bin ls.txt
sed is a stream editor, useful to modify files or input from other programmes. s
What is the difference in the output to just cat ls.txt?
What happens if you add the g modifier after the substitution? sed ls.txt
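The effect of the g modifier can be seen on a single line of text. This is a sketch with a made-up substitution (bin to BIN) and a made-up input line, not the command from the slide.

```shell
line="bin sbin bin"        # hypothetical input with several matches

# Without g, only the first match on each line is replaced.
first_only=$(echo "$line" | sed 's/bin/BIN/')
echo "$first_only"

# With g, every match on the line is replaced.
every_match=$(echo "$line" | sed 's/bin/BIN/g')
echo "$every_match"
```

sed writes its result to standard output and leaves the input untouched.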
How would you save the output of this modification?
cut
cut allows you to extract columns of characters (cut -c1-3 ls.txt) or columns identified by delimiters from a text file (cut -d ' ' -f1 ls.txt).
$ cut -d':' -f1 /etc/passwd
root
daemon
bin
sys
sync
games
Bala
How would you get multiple columns?
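One possible answer, as a sketch: the -f option of cut accepts a comma-separated list of fields as well as ranges. The record below reuses the passwd line shown earlier as sample input.

```shell
record="michael:x:1007:100::/home/michael:/bin/bash"

# Fields 1 and 7: user name and login shell.
user_and_shell=$(echo "$record" | cut -d: -f1,7)
echo "$user_and_shell"

# Fields 1 to 3 as a range.
first_three=$(echo "$record" | cut -d: -f1-3)
echo "$first_three"
```

cut keeps the delimiter between the selected fields in its output.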
Connecting to a remote machine
Very often working in a terminal on a Unix machine is done via a remote connection, so that no GUI is available.
Nowadays this is mostly done using ssh (the secure shell) via an encrypted connection. Earlier, telnet was used (insecure).
ssh -l
Many other options are available including tunneling other applications through an encrypted ssh connection and also forwarding of graphical applications to/from a remote machine when it is not allowed to do this directly.
From Windows computers, the PuTTY client is a very good tool to connect to Unix servers via ssh.
The command scp is used to copy files via an ssh connection between client and server.
Where to get information?
All information is available freely on the net.
http://www.linux.org/lessons/beginner/index.html http://tldp.org/LDP/intro-linux/html/
As for books:
Unix Power Tools (Shelly Powers,
Offers a good overview over practically every topic. From very basic to quite advanced.
Where to go from here?
The best way to learn the command line is to USE it. Experiment with the commands and read the man pages. Put it all together and you have everything you need to write your own shell scripts.
The links below offer references and introduction tutorials and guides for scripting in the bash shell:
http://tldp.org/LDP/GNU-Linux-Tools-Summary/html/index.html
http://www.tuxfiles.org/linuxhelp/cli.html
http://gd.tuwien.ac.at/linuxcommand.org/
http://www.pixelbeat.org/cmdline.html
http://tldp.org/LDP/abs/html/
Michael Wrzaczek, [email protected]
Steven Levy: Hackers: Heroes of the Computer Revolution
Neal Stephenson: In the Beginning... Was the Command Line