up and Shutting down A primer for troubleshooting

In this section, we touch upon the startup and shutdown process. It is beyond the scope of this course to cover this topic in depth, and we highly recommend that you read more about it in the official Red Hat documentation or the documents listed at the end of this section. There is an abundance of information on this topic available in books and on the web.

The Boot Up Process

Please refer to the official Red Hat documentation for details about the boot up process.

http://www.redhat.com/docs/manuals/linux/RHL-9-Manual/ref-guide/ch-boot--shutdown.html

Bootloaders

Two common bootloaders in Linux are Lilo and grub. Bio-Linux currently uses grub.

Grub installs a boot loader to the Master Boot Record. By configuring grub, you can put specific instructions in the Master Boot Record that allow you to load menus or command environments such that you could choose to start up different operating systems, pass information to the kernel, etc.

The grub configuration file is at /boot/grub/grub.conf. Unless you are really into system work, you will probably not ever have to touch this file.

More information on grub can be found at: http://www.redhat.com/docs/manuals/linux/RHL-9-Manual/ref-guide/s1-grub-whatis.html

Once grub has done its job, it hands over control of the machine to the kernel.

Run levels

Run Level   Description
0           Halt
1           Single user mode
2           Multiuser, without networking (without NFS)
3           Full multiuser mode, with networking
4           Unused
5           The same as 3, but with X11 started automatically
6           Reboot
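As a small illustration, the run level table can be written as a shell lookup. This is only a sketch: describe_runlevel is a made-up helper name, and the descriptions are copied from the table above rather than read from the system.

```sh
#!/bin/sh
# describe_runlevel is a hypothetical helper; the text for each level is
# copied from the run level table in this section.
describe_runlevel() {
    case "${1:-}" in
        0) echo "Halt" ;;
        1) echo "Single user mode" ;;
        2) echo "Multiuser, without networking (without NFS)" ;;
        3) echo "Full multiuser mode, with networking" ;;
        4) echo "Unused" ;;
        5) echo "The same as 3, but with X11 started automatically" ;;
        6) echo "Reboot" ;;
        *) echo "Unknown run level: ${1:-}" >&2; return 1 ;;
    esac
}

describe_runlevel 5
```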

The first process run after the kernel has loaded is /sbin/init – this runs with pid 1.

The default run level on your system is 5.

0 and 6 are used for shutdown and reboot of the machine respectively.

/sbin/init 0 is the same as running shutdown -h now

/sbin/init 6 is the same as running shutdown -r now

If neither the -h nor the -r option is given to the shutdown command, the default is to take the system to run level 1: single user mode. This is the run level used when troubleshooting system problems. To switch to single user mode, you can also use the command:

/sbin/init 1

This can be run using sudo or run as root.

/etc/inittab

init reads the file /etc/inittab to determine what to do and when to do it.

The default run level for your system is defined in this file.

Entries in the /etc/inittab file take the form: id:runlevels:action:process
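To make the entry format concrete, here is a minimal sketch that picks the default run level out of a file in this format. It assumes the default is given by the entry whose action field is initdefault; default_runlevel is a made-up helper name, and the sample file is written to a scratch location so nothing on your system is touched.

```sh
#!/bin/sh
# default_runlevel is a hypothetical helper. It prints the runlevels field
# of the first entry whose action field is "initdefault".
default_runlevel() {
    # Skip comment lines, split each entry on ":", and test the third field.
    awk -F: '!/^#/ && $3 == "initdefault" { print $2; exit }' "${1:-/etc/inittab}"
}

# Sample entries in the id:runlevels:action:process format.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# Default run level
id:5:initdefault:
l3:3:wait:/etc/rc.d/rc 3
l5:5:wait:/etc/rc.d/rc 5
EOF

default_runlevel "$sample"
rm -f "$sample"
```

Called without an argument, the helper reads /etc/inittab itself.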

For more information on this, please refer to the Boot Process section of The CTDP Linux Startup Manual (link at the bottom of this document).

The series of lines where the action is wait contain the script name /etc/rc.d/rc. This script takes the argument at the end of the line (0, 1, 2, etc.), which corresponds to a run level. The script first runs the “kill” scripts (those whose names start with the letter K in the directory for that run level) and then runs the startup scripts (those whose names start with the letter S). There is one such line for each run level, as can be seen in the /etc/inittab file.

Note that the “K” and “S” scripts are in fact soft links to the scripts themselves, which are stored in /etc/init.d/. This is important to remember if you edit scripts.
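The order in which the links in a run level directory would be run can be previewed with a short sketch like the one below. It is not the real /etc/rc.d/rc script: rc_order is a made-up helper that only prints names, K links first and then S links, each with the stop or start argument it would receive.

```sh
#!/bin/sh
# rc_order is a hypothetical helper. It only lists links; it runs nothing.
# The body is a subshell so the locale change does not leak to the caller.
rc_order() (
    # Globs expand in sorted order; the C locale keeps that order predictable.
    LC_ALL=C
    export LC_ALL
    for link in "$1"/K*; do
        [ -L "$link" ] || [ -e "$link" ] || continue
        echo "$(basename "$link") stop"
    done
    for link in "$1"/S*; do
        [ -L "$link" ] || [ -e "$link" ] || continue
        echo "$(basename "$link") start"
    done
)

# Prints nothing if the directory does not exist on this machine.
rc_order /etc/rc.d/rc5.d
```

Running ls -l on the same directory shows which script in /etc/init.d/ each link points to.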

The K and S scripts are executed in alphanumeric order. Thus, if you are adding scripts, you can control when in the bootup/shutdown process they are executed by giving an appropriate soft link name in the appropriate run level directory.

The kinds of things that can cause boot problems

• The root partition cannot be mounted
• A partition could not be automatically fsck’d
• A partition has filled up completely – especially the root partition or /var if it is its own partition
• The kernel was told to start up in runlevel 1
• Loading errors
• Bad startup scripts
• Bad /etc/fstab file
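One of the causes listed above, a full partition, is easy to check for once you have a shell. The sketch below is one possible approach, not a standard tool: full_filesystems is a made-up helper that reads df -P output and reports mount points at or above a usage threshold (95% is an arbitrary default). It assumes mount points contain no spaces.

```sh
#!/bin/sh
# full_filesystems is a hypothetical helper. It reads "df -P" output on
# standard input; field 5 is the capacity (e.g. "99%") and field 6 is the
# mount point.
full_filesystems() {
    awk -v limit="${1:-95}" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= limit + 0) print $6 " is " $5 " full"
    }'
}

df -P | full_filesystems 95
```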

What happens when the machine shuts down?

When you shut the machine down cleanly, a number of things happen:

• users logged in are notified that the system is going down. (Hopefully you have given them advance warning of this shutdown.)
• all running processes are sent a signal telling them to terminate
• all subsystems are shut down gracefully
• all users still on the system are logged off
• all pending disk updates are completed (sync)
• the system is taken to the specified runlevel

sync

sync is executed during the shutdown process. It schedules the necessary disk writes so that the system can exit cleanly.

What happens when the system crashes

If your machine does crash, upon the next boot, a full filesystem check will be carried out using fsck.

Fsck goes through the system and tries to recover any corrupt files it finds. Anything resulting from this recovery operation is placed in a directory called /lost+found. Files in this directory will not always be complete or usable; it depends on how successful the recovery was.

More on fsck

fsck (filesystem check) checks the filesystem’s consistency, reports any problems it finds, and optionally repairs them.

With the exception of the root filesystem, fsck runs on unmounted filesystems. To run fsck on the root filesystem, you must be in single user mode.

You will probably not have to run fsck manually often, but please make sure you read the documentation on this important program so that you will understand how it works, and how to interpret the information it gives you, should a crash occur.

Log files (sources of information!)

/var/log/messages

/var/log/boot.log

/var/log/dmesg
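When troubleshooting, a quick first pass over these logs can be sketched as follows. recent_problems is a made-up helper, and the search terms are only a guess at useful keywords; treat the output as a starting point and read the surrounding lines in the log itself.

```sh
#!/bin/sh
# recent_problems is a hypothetical helper.
# $1: log file, $2: how many trailing lines to inspect (default 200)
recent_problems() {
    tail -n "${2:-200}" "$1" | grep -iE 'error|fail|panic'
}

for log in /var/log/messages /var/log/boot.log /var/log/dmesg; do
    # These files are often readable only by root; skip any we cannot read.
    [ -r "$log" ] || continue
    echo "== $log"
    recent_problems "$log" || echo "(no matches)"
done
```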

The /etc/nologin file

If a file called /etc/nologin exists, no user other than root can log in. This can be very useful when you are troubleshooting and need the whole system to be up and functional, including networking, but don’t want users logging onto the system. If you put information into the /etc/nologin file, that information will be reported to any user trying to log in. For example, you can let them know the system is down for maintenance and to try again later.
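Creating and removing the file can be wrapped in two small helpers. This is a sketch: block_logins and allow_logins are made-up names, and the optional file argument exists only so that you can try the idea on a scratch file before touching /etc/nologin as root.

```sh
#!/bin/sh
# block_logins and allow_logins are hypothetical helpers.
# block_logins: $1 is the message for refused users, $2 the file to write
# (default /etc/nologin).
block_logins() {
    printf '%s\n' "$1" > "${2:-/etc/nologin}"
}

# allow_logins: $1 is the file to remove (default /etc/nologin).
allow_logins() {
    rm -f "${1:-/etc/nologin}"
}

# Try it on a scratch file first.
scratch=$(mktemp)
block_logins "System down for maintenance. Please try again later." "$scratch"
cat "$scratch"
allow_logins "$scratch"
```

Remember to remove the file when you have finished, or users will remain locked out.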

Starting and stopping services

You can start and stop many of the services running on your machine by typing the name of the appropriate startup script, followed by start or stop.

If the script has been written properly, you should be able to find what options are available by just typing the name of the script. For example, typing:

/etc/init.d/sshd

will give you information on the options available for running this script.
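The start/stop/usage behaviour usually comes from a case statement at the end of the script. The sketch below shows the general shape only; demo_service stands in for an imaginary script, and the service name and messages are invented for illustration.

```sh
#!/bin/sh
# demo_service stands in for a hypothetical /etc/init.d script. A real
# script would launch or signal a daemon where this one only prints.
demo_service() {
    case "${1:-}" in
        start)   echo "Starting demo" ;;
        stop)    echo "Stopping demo" ;;
        restart) demo_service stop; demo_service start ;;
        *)
            # No argument, or an unknown one: print the available options.
            echo "Usage: demo {start|stop|restart}"
            return 1
            ;;
    esac
}

demo_service start
demo_service stop
```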

Login initialisation files

When a user logs in, a number of initialisation files are read. These vary according to what the default shell of the user is.

System wide, the files of interest are:

/etc/profile
/etc/zshrc
/etc/bashrc

Users can override the defaults set centrally, or add their own settings, using “dot” files in their account. There are a number of possibilities depending on their default shell and what they want exactly. Key files include

~/.profile
~/.login
~/.zshrc
~/.cshrc
~/.bashrc (executed by interactive non-login instances of bash)
~/.bash_profile (executed by a login bash session; if this file is not present, bash uses .profile instead)

These files are sourced at different times:

.profile, .bash_login, and .login are executed when a user logs in.
.zshrc, .cshrc, .bashrc, etc. are executed every time a new shell is spawned.

On this basis – information stored in a .login or .profile file should be that which needs to be executed at login time. This includes such things as:

• setting the PATH
• setting the default file protection (with umask)
• setting the terminal type
• setting other environment variables

The types of things you might want in a .xxxrc file include:

• setting shell variables
• defining aliases

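Note: In csh and tcsh, the shell initialisation file is executed before the login initialisation file, e.g. first .cshrc and then .login. Other shells order their files differently, so check the manual page for the shell in question.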
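As an illustration of this split, here is a sketch of the kind of lines that might go in each type of file for a Bourne-style shell. The directory, editor, and alias are example choices, not recommendations for your system.

```sh
# Login-time settings, e.g. for ~/.profile: done once per login session.
PATH="$HOME/bin:$PATH"     # add a personal bin directory (example choice)
export PATH
umask 022                  # new files are not writable by group or others
EDITOR=vi                  # example environment variable
export EDITOR

# Per-shell settings, e.g. for ~/.bashrc or ~/.zshrc: redone in each new shell.
alias ll='ls -l'
```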

More on this topic can be found on the web at:

http://www.linuxvalley.it/encyclopedia/ldp/howto/HOWTO/mini/Path-6.html

http://wwwhepix.web.cern.ch/wwwhepix/wg/scripts/www/shells/user.html

Other useful commands and files

chkconfig

This lets you find out about, and update, run level information for your system’s services. In other words, this is a command line tool for maintaining the /etc/rc.d/rc[0-6].d directory hierarchy. For example, to find out about all the services that chkconfig knows about, you can run:

/sbin/chkconfig --list
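The listing has one line per service, with an on/off flag for each run level, so it is easy to filter. The sketch below assumes that layout; services_on_at is a made-up helper, and the sample lines (including their on/off states) are invented for the example.

```sh
#!/bin/sh
# services_on_at is a hypothetical helper. It reads "chkconfig --list"
# output on standard input and prints the services switched on at level $1.
services_on_at() {
    awk -v flag="$1:on" '{
        for (i = 2; i <= NF; i++)
            if ($i == flag) { print $1; break }
    }'
}

# Invented sample lines in the service / level:state layout.
cat <<'EOF' | services_on_at 3
sshd            0:off   1:off   2:on    3:on    4:on    5:on    6:off
exampled        0:off   1:off   2:off   3:off   4:off   5:on    6:off
EOF
```

On a real system you would pipe the command itself into the helper: /sbin/chkconfig --list | services_on_at 5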

A graphical interface to chkconfig can be run by typing:

serviceconf

And if you are so inclined, there is also a text-based interface to chkconfig, which can be run by typing

sudo /usr/sbin/ntsysv

dmesg

dmesg reports on the kernel boot. It shows the devices the kernel has found and whether it was able to configure them all. A log of this output can be found at /var/log/dmesg.

Information at the top of /var/log/dmesg includes the kernel version and build.

/etc/motd

The contents of the /etc/motd (message of the day) file appear on a user’s terminal when they first log in. This is an extremely useful way to advertise messages about impending shutdowns if people are logging into your machine using terminal sessions such as ssh.

/sbin/shutdown

Run with different flags, this command allows you to shut down the machine cleanly and to reboot into designated run levels. It can also display a warning message on users’ terminals, and it has an option to cancel a pending shutdown.
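The most common flag combinations are sketched below. To keep the example safe to run, shutdown_command is a made-up helper that only prints a command line and shuts nothing down; the -h, -r and -c flags and the time argument (now, +minutes, or hh:mm) are those of the shutdown command itself.

```sh
#!/bin/sh
# shutdown_command is a hypothetical helper that only PRINTS a command line.
# $1: halt | reboot | cancel, $2: time (default now), $3: warning message
shutdown_command() {
    case "${1:-}" in
        halt)   echo "/sbin/shutdown -h ${2:-now} \"${3:-}\"" ;;
        reboot) echo "/sbin/shutdown -r ${2:-now} \"${3:-}\"" ;;
        cancel) echo "/sbin/shutdown -c" ;;
        *)
            echo "Usage: shutdown_command {halt|reboot|cancel} [time] [message]" >&2
            return 1
            ;;
    esac
}

shutdown_command halt +10 "Going down for disk maintenance"
shutdown_command reboot now "Rebooting"
shutdown_command cancel
```

To act on one of the printed lines, run it as root or with sudo.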

/sbin/reboot

Please do not use reboot - use shutdown instead. On Linux systems not in level 0 or 6, using the command reboot will run the command shutdown instead. However, this is not necessarily true on other unix systems where reboot may not shut down the system as cleanly as shutdown.

/sbin/runlevel

This program displays the current and previous system runlevel.
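The output is two fields, previous then current, with N in the first field when there was no previous run level (for example "N 5" just after booting). A minimal sketch for splitting that output, using a made-up helper named parse_runlevel:

```sh
#!/bin/sh
# parse_runlevel is a hypothetical helper. It expects the two-field output
# of /sbin/runlevel as a single string, e.g. "N 5".
parse_runlevel() {
    prev=${1%% *}    # text before the first space
    curr=${1##* }    # text after the last space
    if [ "$prev" = "N" ]; then
        prev="none"
    fi
    echo "previous=$prev current=$curr"
}

parse_runlevel "N 5"
parse_runlevel "5 3"
```

On a real system you could call it as: parse_runlevel "$(/sbin/runlevel)"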

Other documents

Essential System Administration – by Aeleen Frisch, ISBN 1-56592-127-5

The CTDP Linux Startup Manual http://www.comptechdoc.org/os/linux/startupman/index.html

Some Suggested Exercises

1) Read /etc/inittab to determine what scripts are executed in what order. Where/when is X started? What does this suggest about troubleshooting X problems?

Read the /etc/rc.d/rc script.

Can you see, looking at this information, why you can reach virtual terminals using the Ctrl-Alt-F1 through Ctrl-Alt-F6 key combinations?

2) Read the man page for chkconfig. Find out about all the services chkconfig knows about on the machine you are working on.

Look at the names of the scripts in the init directories for the various run levels, (/etc/rc.d/rc0.d, /etc/rc.d/rc1.d, /etc/rc.d/rc5.d, and so on.) Can you see a connection between the information you see in chkconfig and the scripts that will be executed at the various run levels?

3) Stop sshd running on your machine using the appropriate init.d command. Start sshd running again.