Michael Wrzaczek Dept of Biosciences, Plant Biology Viikki Plant Science Centre (ViPS) University of Helsinki, Finland

November 11th, 2015

Why learn and use Unix

http://stein.cshl.org/genome_informatics/unix1/why_unix.html

Bioinformatics tools built on Unix

Why learn and use Unix

simple I/O system
good programming tools
building block approach to programming
many text manipulation programmes
easy networking (integrated networking)
are lazy

Why learn and use Unix

Unix is very powerful

Many tools run on Unix

Majority of web services run on Unix

What is Unix...

UNIX 70ies, Bell Laboratories

1969: UNIX in Assembler code ()

1972-1974: re-implementation in C (Ken Thompson and Dennis Ritchie)

Until 1979: UNIX source code free, and freely distributed

Early 80

The Unix-Philosophy

Write programmes so that they do one thing, and do that one thing well.

The Unix-Philosophy

Write programmes so that they can work together.

The Unix-Philosophy

Write programmes so that they use text files, since that is a universal interface. (Douglas McIlroy)

The Unix-Philosophy

Ease of use/user-friendliness was not one of the prime goals of UNIX. UNIX was an operating system to develop software on: a system for geeks.

The Unix-Philosophy

According to Eric S. Raymond, the Unix philosophy can be summed up as KISS: Keep It Simple, Stupid.

The Unix-Philosophy

Mike Gancarz, 9 paramount precepts:

1. Small is beautiful
2. Make each programme do one thing well
3. Build a prototype as soon as possible
4. Choose portability over efficiency
5. Store data in flat text files
6. Use software leverage to your advantage
7. Use shell scripts to increase leverage and portability
8. Avoid captive user interfaces
9. Make every programme a filter
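The precepts "do one thing well" and "make every programme a filter" can be sketched with a short pipeline. This is only an illustration: the gene names are made-up sample data and the variable names are arbitrary.

```shell
# Hypothetical sample data: one gene name per line, with repeats.
genes='BRCA1
TP53
BRCA1
EGFR
TP53
BRCA1'

# Each tool is a small filter reading text and writing text:
# sort groups identical lines, uniq -c counts them, sort -rn orders by
# count, head keeps the top line, and awk prints only the name column.
top_gene=$(printf '%s\n' "$genes" | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2}')
echo "$top_gene"
```

None of the tools knows anything about genes; the result comes from combining generic text filters.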

The Unix-Philosophy

10 lesser tenets (not universally agreed upon):

1. Allow the user to tailor the environment
2. Make operating system kernels small and lightweight
3. Use lowercase and keep it short
4. Save trees
5. Silence is golden
6. Think parallel
7. The sum of the parts is greater than the whole
8. Look for the 90 percent solution
9. Worse is better (simplicity over perfection!)
10. Think hierarchically

Unix makes it easy and efficient to

customize
create from scratch
connect (workflow automation)

Since many flavours of Unix are open-source, it is very easy to take existing code/programmes and modify them according to the specific requirements.

POSIX-compliant: AIX, IRIX, Solaris, Mac OS X

Not certified as POSIX-compliant: GNU, Minix, BSD, Linux

POSIX standard: standardized application-level interface for Unix.

While AIX, IRIX, Solaris and Mac OS X are POSIX-compliant, Linux and the BSDs are not POSIX-certified but generally comply with the standards.

Plain and simple, the POSIX standard makes it possible to write a new operating system that is fully compatible with other Unix systems by sticking to these specifications.

So, what is Linux?

kernel of the GNU/Linux operating system

operating system

-> distribution

-compatible free operating system. -user ready, Linux-kernel is used.

The kernel is the central component of computer operating systems. Amongst other things, it regulates the communication between software and hardware, and also between processes.

kernel of the GNU/Linux operating system

software required (to interact with the computer.)

Linux distributions

kernel

+ additional software

+ package management for simple software installation and removal

Note that many devices run Linux without you being aware of it, e.g. routers, PDAs, etc.

freely available on the net
Pay for company support
Information: discussion forums, newsgroups

The community is generally very helpful in solving problems.

Some GNU/Linux Distributions:
Ubuntu (http://www.ubuntulinux.org)
Debian GNU/Linux (http://www.debian.org)
openSUSE (Novell SUSE Linux Desktop; http://www.opensuse.org)
Fedora (Red Hat; http://www.fedoraproject.org)
Slackware Linux (http://www.slackware.org)
Gentoo GNU/Linux (http://www.gentoo.org)
Mandriva Linux (formerly Mandrake; http://www.mandriva.com)

Which version of Linux is right for me?

Depends on use and taste.

Arch, Debian, Gentoo, Slackware, etc.
Suse, Red Hat

Communication with Unix

You do not talk directly to the operating system

You talk to a programme that communicates with Unix, or that communicates with another programme that communicates with Unix

What kinds of programmes can you talk to?
a shell
an interactive command
a Graphical User Interface (GUI)

. That comes from the teletypes that were originally used to interact with the operating system

Note: Technically, any software that allows the user to interact with a computer is a shell. A GUI is a shell just like a terminal. However, usually the command-line interpreter (or terminal) is referred to as the shell.

Communication with Unix

[Figure: communication with Unix. The user sends commands and data to the shell and receives prompts and output; the shell handles its built-in commands and passes requests for services on to the kernel and device drivers.]

The shell is a PROGRAMMING LANGUAGE!

Shell scripts are not compiled (= translated into binaries, the machine-readable format) but interpreted at runtime: the shell is an INTERPRETED LANGUAGE.
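What "interpreted" means in practice can be sketched as follows. The script content and the variable names are made up for illustration; the point is only that a plain text file is handed to the shell and run without any compilation step.

```shell
# Write a two-line script to a temporary file (the file name is arbitrary).
script=$(mktemp)
cat > "$script" <<'EOF'
name="Unix"
echo "Hello from $name"
EOF

# No compiler involved: sh reads the text file and executes it line by line.
greeting=$(sh "$script")
echo "$greeting"

# Clean up the temporary script.
rm -f "$script"
```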

Shell scripts are very powerful, but sophisticated alternatives exist, for example Python, which we will see later in the course.

The Unix shell

Different types

Thompson Shell (osh; the original UNIX shell), Bourne Shell (sh), C-Shell (csh)

Korn Shell (ksh), Bourne Again Shell (bash), Z Shell (zsh)

Built-in commands

Which shell are you running?

michael@infocalypse ~ $ grep michael /etc/passwd
michael:x:1007:100::/home/michael:/bin/bash
michael@infocalypse ~ $ echo $SHELL
/bin/bash
michael@infocalypse ~ $

The Unix shell: a few helpful things

CASE SENSITIVE! Trying to open a file or using a command will not work if you mix up upper- and lowercase letters. Commonly, all commands use lowercase letters, while options and filenames can be upper- or lowercase.

The Unix shell: a few helpful things

Tab completion: typing the first few letters and then hitting the TAB key will complete the command/filename. If there are still several possibilities, pressing TAB a second time will present you with a list of matching commands/filenames.

Type co and press TAB TAB. This will give you all commands starting with co.

Pathnames: files and directories in a Unix filesystem

File extensions usually do not identify a file (there are a few exceptions!!!)

Access to files is based on the concept of users and groups

Every user has a unique account with a unique ID number (UID)

Users are members of one or more groups

Users and groups can efficiently be used to restrict or allow access to files depending on requirements
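A quick way to look at your own user and group information is the id command. The following is a small sketch; the variable names are arbitrary, and the values printed depend on the machine and account it is run on.

```shell
# Numeric user ID (UID) of the current user.
user_id=$(id -u)

# Login name of the current user.
user_name=$(id -un)

# Names of all groups the current user is a member of.
group_list=$(id -Gn)

echo "$user_name has UID $user_id and is a member of: $group_list"
```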

wrzaczek@kasbi31 ~ $ cd /
wrzaczek@kasbi31 / $ ls -l
total 72
drwxr-xr-x  2 root root  4096 2008-01-29 10:10 bin
drwxr-xr-x  2 root root    29 2005-06-09 14:43 boot
drwxr-xr-x 14 root root 13180 2008-01-31 16:09 dev
drwxr-xr-x 84 root root  8192 2008-02-03 19:08 etc
drwxr-xr-x 26 root root  4096 2007-11-09 13:20 home
drwxr-xr-x  9 root root  8192 2008-01-29 10:10 lib
drwxr-xr-x  2 root root    33 2008-01-29 12:24 media
drwxr-xr-x  9 root root   106 2006-03-13 19:21 mnt
drwxr-xr-x 13 root root  4096 2007-08-10 17:44 opt
dr-xr-xr-x 93 root root     0 2008-01-31 16:08 proc
drwx------ 17 root root  4096 2008-02-03 19:07 root
drwxr-xr-x  2 root root  8192 2008-01-29 12:24 sbin
drwxr-xr-x 11 root root     0 2008-01-31 16:08 sys
drwxrwxrwt  6 root root  4096 2008-02-03 20:20 tmp
drwxr-xr-x 14 root root  4096 2007-11-02 01:01 usr
drwxr-xr-x 14 root root  4096 2006-09-15 20:40 var
wrzaczek@kasbi31 / $

Inside a Unix system...

Enter the command h

Unix system harddisks

Partition harddisks will point to the assigned harddisk

Mounting is the attaching of an additional filesystem to the currently accessible filesystem of a computer.

You can tell your computer to add a new filesystem (a partition on a harddisk, a CD, a DVD, a USB stick etc) to the system and point to it from a folder on the already existing filesystem.

The command cd (change directory) moves you around the filesystem.

cd without any options takes you to your home directory.

cd / takes you to the root level (= the most basal part of the filesystem)

NOTE: root can mean different things
root: the account of the system administrator
root: the home directory of the root user
root: the basal level of the filesystem, indicated by /

When you log in to your Linux machine you find yourself in your HOME directory.
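Moving around with cd can be tried out as in the sketch below. It assumes that a /tmp directory exists and that the HOME variable is set, which is the case on most systems; the variable names are arbitrary.

```shell
# Go to the root level of the filesystem and record where we are.
cd /
at_root=$(pwd)

# cd without arguments goes to the home directory.
cd
at_home=$(pwd)

# Enter /tmp, then move one level up again with "cd ..".
cd /tmp
cd ..
parent_of_tmp=$(pwd)

echo "root level: $at_root"
echo "home directory: $at_home"
echo "one level above /tmp: $parent_of_tmp"
```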

michael@infocalypse ~ $ cd /
michael@infocalypse / $ ls -l
total 72
drwxr-xr-x  2 root root  4096 2008-01-29 10:10 bin
drwxr-xr-x  2 root root    29 2005-06-09 14:43 boot
drwxr-xr-x 14 root root 13180 2008-01-31 16:09 dev
drwxr-xr-x 84 root root  8192 2008-02-03 19:08 etc
drwxr-xr-x 26 root root  4096 2007-11-09 13:20 home
drwxr-xr-x  9 root root  8192 2008-01-29 10:10 lib
drwxr-xr-x  2 root root    33 2008-01-29 12:24 media
drwxr-xr-x  9 root root   106 2006-03-13 19:21 mnt
drwxr-xr-x 13 root root  4096 2007-08-10 17:44 opt
dr-xr-xr-x 93 root root     0 2008-01-31 16:08 proc
drwx------ 17 root root  4096 2008-02-03 19:07 root
drwxr-xr-x  2 root root  8192 2008-01-29 12:24 sbin
drwxr-xr-x 11 root root     0 2008-01-31 16:08 sys
drwxrwxrwt  6 root root  4096 2008-02-03 20:20 tmp
drwxr-xr-x 14 root root  4096 2007-11-02 01:01 usr
drwxr-xr-x 14 root root  4096 2006-09-15 20:40 var
michael@infocalypse / $

cd / to go to the lowest level of the filesystem.

Unix: single-root hierarchical filesystem.

Devices (harddisks, CDs, USB sticks) are mounted to directory locations (mountpoints).

Some basic information

on who is using the system, who you currently are, and which directory you are currently in.

ls, df, du: on files, directories and filesystems/disks.
ls lists the contents of a directory
df displays free/used disk space
du estimates the disk usage of files

Try out the commands and check the man pages for useful options (e.g. -l, -h, -s, --max-depth).
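The three commands can be tried safely in a scratch directory, as in this sketch. The directory is created with mktemp, and the file name seq.txt and its content are made up for illustration; the sizes reported will differ between systems.

```shell
# Create a scratch directory with one small file in it.
workdir=$(mktemp -d)
printf 'ACGT\n' > "$workdir/seq.txt"

# ls -l: long listing of the directory contents.
ls -l "$workdir"

# df -h: free/used space of the filesystem holding the directory,
# in human-readable units.
df -h "$workdir"

# du -sh: summarised disk usage of the directory, human-readable.
du -sh "$workdir"

# Count the entries in the directory, then clean up.
file_count=$(ls "$workdir" | wc -l | tr -d ' ')
rm -r "$workdir"
```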

The Unix shell: a few helpful things

Command history: using the arrow-up key you can re-use the commands that you used before in your shell.

Press the arrow-up key to step through the previously used commands, or use

history 10

to see e.g. the last 10 commands.

Your best friend: the man pages

To obtain any information about the precise use of a command and the available options use man

Most of the Unix commands have a well-written man page describing their function and usage.

man is your friend!

man man
man ls
man df

Some common Unix commands

The following is a list of commonly used LINUX/UNIX commands which may be of value during your Telnet sessions. Remember that LINUX/UNIX is case sensitive. Options or flags which can be used with a command are placed in [ ]. The [ ] are not part of the command and should not be included in the command that you type.

passwd: changes your password

logout: logs you out of a Telnet session

cd: change directory; cd .. moves you up to the next higher directory level; cd / moves you to the highest directory level

chmod permissions filenames: changes the permissions for a file; permissions should include a letter designating who gets permissions (u for the user, g for the group, o for others, or a for all) followed by a + or - (to give or take away the permission) followed by the kind of permission (r for read access, w for write access, x for execute if the file is a program or script); the complete command that you type should look like: chmod g-w filename

chown user:group filenames: changes ownership of a file

clear: clears the screen

cp oldfiles newfiles: copies a file; this leaves the old file intact and makes a new copy with a new filename

date: tells you the current date and time

df: displays how much space on the disks (harddrive partitions) is free

du [-a] [-s] directories: tells you how much disk space your files occupy; the -a option displays the space used by each file, not just each directory; the -s option displays the total space used for each directory but not subdirectory

help: provides online help; several topics have been included in the help system available on the servers

ls [-l] [-a] [-p] [-r] [-t] [-x]: lists the files in a directory; -l displays detailed information about each file and directory, including permissions, owners, size and time/date when the file was last modified; -a displays all the files and subdirectories including hidden files (with names that begin with a dot); -p displays a slash at the end of each directory name to distinguish them from filenames; -r displays files in reverse order; -t displays files in order of modification time; -x displays the filenames in columns across the screen

man [-k keywords] topic: displays the reference manual page about a LINUX command; the -k keywords option allows you to see all man pages that contain that keyword; topic is the command or topic which you want information about

mesg [n|y]: lets you control whether other people can use the talk command to interrupt you with on-screen messaging; mesg n will block the interruptions; mesg y will allow interruptions

mkdir new_directory: makes a new subdirectory with the name specified by new_directory

mv [-i] oldname newname: renames a file or moves it from one filename or directory to another; the -i option tells mv to prompt you before it replaces an existing filename

ping IP address or server: sends a ping packet to another server; this provides information concerning the time it takes for information to make the round trip to the other computer; it will also tell you whether the other server is on-line at that time

ps: displays information about your processes/jobs/programs which are running on the server

pwd: shows the directory you are currently in

ssh username@servername: allows you to log in to a remote ssh server via an encrypted connection (nowadays almost always replaces telnet)

rm [-i] [-r] filenames: removes or deletes files; the -i option asks you to confirm that you want to delete each file; the -r option is dangerous because it allows you to delete an entire directory and all of the files it contains

rmdir directory: removes an empty directory; to remove a directory together with its contents, use rm -r

tail [-r] [-lines] filename: displays the last few lines of a file; -r displays the lines in reverse order; -lines specifies the number of lines, starting at the end of the file, you want to see

touch [-a] [-c] [-m] [date] filenames: changes the date and time for a file without changing the content of the file; -a changes only the date and time the file was last accessed; -c doesn't create a file if it does not already exist; -m changes only the date and time the file was last modified; date specifies the date and time to give the file in the mmddhhnn format (month, day, hour, minute); touch with a new filename will create a new, empty file

traceroute IP address or server alias: provides information concerning the route which packets must take to get from your computer (the server in this case) to a remote computer/server; typically used to diagnose possible problems in packet routing

vim: Vim is a text editor. Further information concerning the editing commands for Vim can be found in the help document.

w: provides information concerning who is logged into the system and some details on how they are connected

who: tells you who is using the server at that time

write username: sends a message to another person using the system; to prevent someone from writing to you, see the mesg n command
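The chmod g-w form described in the list can be tried on a scratch file, as sketched below. The file name notes.txt is made up, and the starting permissions are set explicitly so that the result does not depend on the local umask.

```shell
# Create an empty scratch file with touch.
workdir=$(mktemp -d)
touch "$workdir/notes.txt"

# Start from known permissions: user rw, group rw, others r.
chmod u=rw,g=rw,o=r "$workdir/notes.txt"

# Take write permission away from the group.
chmod g-w "$workdir/notes.txt"

# The first ten characters of ls -l show file type and permissions.
perms=$(ls -l "$workdir/notes.txt" | cut -c1-10)
echo "$perms"

rm -r "$workdir"
```

The permission string reads as one character for the file type followed by three rwx triplets for user, group and others.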

Becoming root

The command su (substitute user) allows you to change your user identity (not only to root, but to any other user). su - allows you to become root (the dash takes you to root's home directory; su alone would make you root but leave you in the current directory).

The passwd command allows you to change your password.

Root can use passwd username to change/set the password of the specified user.

Using useradd, root can add a new user to the system. A common use would be e.g. useradd -m -s /bin/bash -G users username

What do the options -m, -s and -G do? Check the man page.

Check the man pages for the commands users, groups and groupadd to see what those commands do.

Manipulating files

cp copies a file/directory from location A to location B. Remember to use the recursive option (-r) to copy directories.

mv renames a file while creates .
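Copying and renaming can be practised without risk in a scratch directory. In this sketch all file and directory names (data, backup, a.txt and so on) are made up for illustration.

```shell
# Set up a scratch directory containing one subdirectory and one file.
workdir=$(mktemp -d)
mkdir "$workdir/data"
echo "sample" > "$workdir/data/a.txt"

# cp: copy a single file.
cp "$workdir/data/a.txt" "$workdir/data/b.txt"

# cp -r: copy a whole directory recursively.
cp -r "$workdir/data" "$workdir/backup"

# mv: rename b.txt to c.txt inside the original directory.
mv "$workdir/data/b.txt" "$workdir/data/c.txt"

# Collect the file names of both directories on one line each.
data_files=$(ls "$workdir/data" | paste -sd ' ' -)
backup_files=$(ls "$workdir/backup" | paste -sd ' ' -)
echo "data:   $data_files"
echo "backup: $backup_files"

rm -r "$workdir"
```

The backup directory still contains b.txt because it was copied before the rename.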

What does the output of ls -l tell you? (This is the same listing of / as shown earlier.)

michael@infocalypse ~ $ cd /
michael@infocalypse / $ ls -l

The command chown transfers ownership of a file/directory, while chmod changes the permissions.
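One way to read a line of ls -l output is to split it into its fields. The sketch below takes the line for home from the listing of / and pulls the fields apart; the variable names are arbitrary, and the field positions assume the date format used in that listing.

```shell
# One line of "ls -l" output, copied from the listing of /.
line='drwxr-xr-x 26 root root 4096 2007-11-09 13:20 home'

# Split the line on whitespace into positional parameters
# (the variable is left unquoted on purpose).
set -- $line
perms=$1; links=$2; owner=$3; group=$4; size=$5; name=$8

# First character: file type (d = directory, - = regular file).
file_type=$(printf '%s' "$perms" | cut -c1)

# Then three triplets: permissions for user, group and others.
user_perms=$(printf '%s' "$perms" | cut -c2-4)
group_perms=$(printf '%s' "$perms" | cut -c5-7)
other_perms=$(printf '%s' "$perms" | cut -c8-10)

echo "$name: type=$file_type owner=$owner ($user_perms) group=$group ($group_perms) others=$other_perms"
```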

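rm removes files. Directories require the -r option. rmdir also removes directories, but only empty ones.

There is no trashcan!

Be careful with the -r option.

Option -i asks for confirmation before each deletion.

Option -f (force) suppresses confirmation prompts and error messages about missing files.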

Make sure you know what you delete.
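A safe way to get a feeling for rm is to practise inside a throw-away directory. In the sketch below the directory comes from mktemp and the names old_results, run1.txt and keep.txt are made up for illustration.

```shell
# Build a scratch directory with one subdirectory and two files.
workdir=$(mktemp -d)
mkdir "$workdir/old_results"
touch "$workdir/old_results/run1.txt" "$workdir/keep.txt"

# Look before you delete.
ls -R "$workdir"

# Plain rm refuses to remove a directory.
rm "$workdir/old_results" 2>/dev/null || echo "rm without -r does not remove directories"

# rm -r removes the directory and everything inside it, with no trashcan.
rm -r "$workdir/old_results"

# Only keep.txt is left.
remaining=$(ls "$workdir")
echo "$remaining"

rm -r "$workdir"
```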

What would happen if you do the following?

cd /
rm -rf *

DO NOT DO THAT!!!!!

About your system...

uptime shows how long the machine has been running,

uname -a shows the version of the running kernel.

ps can be used to view the running processes

ps -A shows all processes (ps -Af also shows who they belong to)

top displays running processes dynamically, ordered by consumed resources

About your system...

With kill it is also possible to send different signals to processes (check the man page).
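Looking at a process and sending it a signal can be sketched with a harmless background job. The sleep command stands in for any long-running programme, and the variable names are arbitrary; on very minimal systems ps may not be installed.

```shell
# Start a long-running process in the background; $! holds its process ID.
sleep 60 &
pid=$!

# Show the process in the process table.
ps -p "$pid"

# Send the TERM signal (the default signal of kill) to the process.
kill -TERM "$pid"

# wait collects the exit status; a status above 128 means that the
# process was ended by a signal.
wait "$pid"
status=$?
echo "exit status: $status"
```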

Useful commands

Useful for everyday work: sort, diff, cmp and comm

What sort does is obvious; check the man pages for the options and possibilities.

diff finds differences between files
cmp compares two files byte for byte
comm compares two sorted files line for line
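The four commands can be compared on two small made-up files. In this sketch the file names and their contents are invented for illustration.

```shell
workdir=$(mktemp -d)

# Two small unsorted word lists.
printf 'pear\napple\nfig\n' > "$workdir/a.txt"
printf 'fig\nkiwi\napple\n' > "$workdir/b.txt"

# sort: comm needs sorted input, so sort both files first.
sort "$workdir/a.txt" > "$workdir/a.sorted"
sort "$workdir/b.txt" > "$workdir/b.sorted"

# diff: prints the lines that differ between the two files.
diff "$workdir/a.sorted" "$workdir/b.sorted"

# cmp -s: silent byte-for-byte comparison; only the exit status is used.
cmp -s "$workdir/a.sorted" "$workdir/b.sorted" || echo "files differ"

# comm -12: hide columns 1 and 2, leaving only lines common to both files.
common=$(comm -12 "$workdir/a.sorted" "$workdir/b.sorted" | paste -sd ' ' -)
echo "common lines: $common"

rm -r "$workdir"
```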

Unix commands can be put into scripts. But they can also be combined directly on the command line.

| (pipe): hands the output of command 1 to the following command. Example: ls -l / | grep bin

&&: the following command(s) will only be executed if the previous command was executed successfully.
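Both ways of combining commands can be tried as follows. The directory names fed into the pipe and the names results and status.txt are made up so that the outcome does not depend on the machine.

```shell
# Pipe: the output of printf becomes the input of grep;
# grep -c counts the lines that contain "bin".
matches=$(printf 'bin\netc\nsbin\nusr\n' | grep -c bin)
echo "lines containing bin: $matches"

workdir=$(mktemp -d)

# &&: mkdir succeeds, so the command after && is executed.
mkdir "$workdir/results" && echo "created" > "$workdir/results/status.txt"

# mkdir now fails (the directory already exists), so the command
# after && is skipped and the file keeps its content.
mkdir "$workdir/results" 2>/dev/null && echo "overwritten" > "$workdir/results/status.txt"

state=$(cat "$workdir/results/status.txt")
echo "$state"

rm -r "$workdir"
```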

Long output & how to save it

Sometimes commands will result in several pages of output, making it very impractical or, in the worst case, impossible to review everything.

One possibility to solve this is to pipe the output through the pager less, e.g. ls -l | less

Alternatively, you can redirect the output of your commands to a file instead of the screen, e.g. ls -l >> filename

>> will append the output to the contents of a file (a non-existing file will be created), while > will overwrite the file!!!

What you can and cannot do...

In a Unix system, the default user is not allowed to do many things...

As a user you are allowed to run specific programmes and manipulate all files that belong to the user.

Only the root user is allowed to do everything.

To allow average users to do certain things, Unix has groups. E.g. in order to use the CD drive, a user has to be a member of the group cdrom.

File permissions are another important tool. File permissions can be set individually for the user, a certain group or everyone. More on that a little later.

root

On Unix operating systems the first user account is called root.

root is the only user on a Unix machine with completely NON-restricted rights. This is needed for installation initially and is usually used later on for administration.

Since root can do anything, you do not work as root on a Unix machine unless it is really required. Use the programmes su or sudo to temporarily become root!

If an attacker finds out your root password, the machine is completely open to them.

On BSD systems (and a few others) an alternative root account named toor exists. toor, however, only has a minimal shell.

grep

ls -l / | grep bin

Check the man page for grep and find out, what grep does.

What happens if you try the following?

ls -l / > ls.txt
grep bin ls.txt
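The difference between the redirection operators > and >> can be checked by counting lines, as in this sketch. It writes ls.txt into a scratch directory rather than the current directory, and it assumes that the contents of / do not change while it runs.

```shell
workdir=$(mktemp -d)
out="$workdir/ls.txt"

# > creates the file (or overwrites it) with the listing of /.
ls -l / > "$out"
first=$(wc -l < "$out" | tr -d ' ')

# >> appends a second copy of the listing to the same file.
ls -l / >> "$out"
second=$(wc -l < "$out" | tr -d ' ')

# > again: the file is overwritten and is back to a single listing.
ls -l / > "$out"
third=$(wc -l < "$out" | tr -d ' ')

echo "lines after >: $first, after >>: $second, after > again: $third"

# grep reads the saved file instead of a pipe.
grep bin "$out"

rm -r "$workdir"
```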

sed is a stream editor, useful to modify files or input from other programmes.

What is the difference in the output compared to just cat ls.txt?

What happens if you add the g modifier after the substitution?
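The effect of the g modifier can be seen on a single line of text. The substitution used here (bin replaced by BIN) and the input line are hypothetical examples, not necessarily the ones from the original slide.

```shell
# A made-up input line in which "bin" occurs three times.
line='bin sbin bin'

# Without g, only the first match on each line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/bin/BIN/')

# With g, every match on the line is replaced.
all_matches=$(printf '%s\n' "$line" | sed 's/bin/BIN/g')

echo "$first_only"
echo "$all_matches"
```

Note that sed matches text, not whole words, so the bin inside sbin is replaced as well when g is used.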

How would you save the output of this modification?

cut

cut allows you to extract columns of characters

cut -c1-3 ls.txt

or columns identified by delimiters from a text file.

cut -d ' ' -f1 ls.txt

$ cut -d':' -f1 /etc/passwd
root
bin
sys
sync
games
Bala

How would you get multiple columns?
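One possible answer, sketched on the passwd entry shown earlier in these slides: the -f option of cut accepts comma-separated lists and ranges of fields. The variable names are arbitrary.

```shell
# A passwd-style entry with seven colon-separated fields.
entry='michael:x:1007:100::/home/michael:/bin/bash'

# A comma-separated list selects individual fields (here 1 and 7).
name_and_shell=$(printf '%s\n' "$entry" | cut -d: -f1,7)

# A range selects consecutive fields (here 1 to 3).
first_three_fields=$(printf '%s\n' "$entry" | cut -d: -f1-3)

# -c selects character positions instead of fields.
first_three_chars=$(printf '%s\n' "$entry" | cut -c1-3)

echo "$name_and_shell"
echo "$first_three_fields"
echo "$first_three_chars"
```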

Connecting to a remote machine

Very often working in a terminal on a Unix machine is done via a remote connection, so that no GUI is available.

Nowadays this is mostly done using ssh (the secure shell) via an encrypted connection. Earlier, telnet was used (insecure).

ssh -l username (-p port) servername

Many other options are available including tunneling other applications through an encrypted ssh connection and also forwarding of graphical applications to/from a remote machine when it is not allowed to do this directly.

From Windows computers, the PuTTY client is a very good tool to connect to Unix servers via ssh.

The command scp is used to copy files via an ssh connection between client and server.

Where to get information?

All information is available freely on the net.

http://www.linux.org/lessons/beginner/index.html http://tldp.org/LDP/intro-linux/html/

As for books:

Unix Power Tools (Shelley Powers et al., O'Reilly)

Offers a good overview over practically every topic. From very basic to quite advanced.

Where to go from here?

The best way to learn the command line is to USE it. Experiment with the commands and read the man pages. Putting it all together, you then have everything you need to write even shell scripts.

The links below offer references and introduction tutorials and guides for scripting in the bash shell:

http://tldp.org/LDP/GNU-Linux-Tools-Summary/html/index.html

http://www.tuxfiles.org/linuxhelp/cli.html

http://gd.tuwien.ac.at/linuxcommand.org/

http://www.pixelbeat.org/cmdline.html

http://tldp.org/LDP/abs/html/

Michael Wrzaczek, [email protected]

Steven Levy: Hackers: Heroes of the Computer Revolution

Neal Stephenson: In the Beginning... Was the Command Line