Unix Directory Structure at NERC
iTSS
November 2005

Contents

1 Introduction
2 The UNIX directory tree
  2.1 Pseudo Filesystems
  2.2 Device Files
  2.3 Links
    2.3.1 Hard links
    2.3.2 Symbolic or soft links
  2.4 Mounting filesystems
    2.4.1 Mounting on boot
  2.5 File Sharing
  2.6 Comparison with Windows filesystems
3 The NERC Directory Structure
  3.1 Local Structure
  3.2 Network Structure
    3.2.1 The /users directory
    3.2.2 The /data directory
    3.2.3 The /packages or /nerc/packages directory
    3.2.4 The /nerc directory
  3.3 Other common mount points
  3.4 Singleton machines
Chapter 1
Introduction
When learning about UNIX or Linux for the first time you will often hear that "everything is a file unless it is a process". This may seem odd: your keyboard certainly doesn't look like a file, neither does your monitor, and what about directories? Well, in a very broad sense, the "everything is a file" statement is true. Directories are just files containing a list of other files, and things like the keyboard and monitor are accessed in the same way as files, appearing as special files in the directory tree. To read input from the keyboard, for instance, you read from a file. To write a message to the console, you write to a file. Thus, files are absolutely central to understanding a UNIX system.
Chapter 2
The UNIX directory tree
Let's take a look at the basic directory structure of a typical UNIX system. The hierarchical nature is fairly evident, with a single root (or trunk) which branches into many directories and subdirectories. The base of the hierarchy is the "/" directory, which is known as the "root" directory. Below this is the rest of the file system. Here is a list of directories found in the root of a Debian system:

bin      common programs shared by the system administrator and the users
boot     the kernel and startup files, possibly boot loader configuration
dev      files that describe hardware
etc      system configuration files
home     user home directories, mainly for stand-alone machines
lib      library files
media    contains common mount points for CD and floppy (Debian)
mnt      general purpose mount points
net      mount point for remote file systems
opt      used for 3rd party software
proc     special filesystem containing system information
root     system administrator home directory (Linux)
sbin     programs used by the system and sysadmin
scratch  mount point for scratch space (arbitrary)
sys      special filesystem for interacting with the kernel
tmp      temporary space available to the system and users
usr      subtree of programs, libraries, documentation for the users
var      storage of variable files such as logs, mail, print spools etc.

Underneath /usr you will find a similar tree to this, containing bin, sbin, lib etc. This is historical, dating from when fast storage was expensive: it allowed the /usr directory tree to be mounted from a slower storage unit (likely disk) whilst the essentials on the root partition could be on fast storage (such as drum). Nowadays it is quite unusual to split /usr from the root partition, and on Solaris you will find /bin is a symbolic link to /usr/bin. This filesystem layout will vary slightly between Solaris, Linux and Irix, but not so much that you won't be able to find your way around. It is worth mentioning that there is an attempt to produce a standard hierarchy, the Filesystem Hierarchy Standard (FHS), which should allow software and users to predict the location of installed files and directories.
2.1 Pseudo Filesystems
On all Unix systems you will come across "pseudo" files. These are files that are not related to storage space. An example would be /dev/null, known as the bit bucket. Anything written to this file is simply thrown away. Very useful for discarding unwanted output.

Sometimes you will come across entire filesystems that don't exist on disk. The most obvious of these is /proc. This virtual filesystem documents kernel and process information and can sometimes be used to tune a running kernel. Here is the "content" of /proc/cpuinfo from my desktop:

    processor       : 0
    vendor_id       : GenuineIntel
    cpu family      : 15
    model           : 2
    model name      : Intel(R) Pentium(R) 4 CPU 2.40GHz
    stepping        : 7
    cpu MHz         : 2399.994
    cache size      : 512 KB
    fdiv_bug        : no
    hlt_bug         : no
    f00f_bug        : no
    coma_bug        : no
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 2
    wp              : yes
    flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid
    bogomips        : 4751.36

This shows that I have a 2.4GHz Pentium 4 clocked at the correct speed and with half a megabyte of cache. I can also tell from this that the processor is capable of hyperthreading, but that this is obviously disabled (the CPU would look like two individual CPUs to the kernel, so there would be further information for processor 1). Lots of interesting information is available through /proc and it is used by common system tools such as ps and top, as we shall see later in the course. The Solaris version of /proc contains similar information but in a less easily readable format.

There are many other virtual filesystems available for Linux. /dev is sometimes a dynamic virtual filesystem. This is because /dev was becoming very large as Linux distributions filled it with device files describing every type of hardware likely to be found on a desktop system. If we make this filesystem dynamic, we can generate the correct device file as we need it. When you come across this filesystem on Linux it is known as udev (there is also a filesystem called devfs, which is now obsolete). Other examples of special filesystems include sysfs, which is used as an API to the kernel. Along with udev, this is very useful for interacting with hotplug devices that may require additional kernel modules and firmware. tmpfs is a filesystem that exists in memory only, which is ideal for /tmp; all contents of a tmpfs filesystem are lost during a power cycle. It's also possible that you will encounter the selinuxfs filesystem, which exposes the API of SELinux to userspace.
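The bit-bucket behaviour of /dev/null described above is easy to see for yourself. This short sketch writes to /dev/null and then shows that the "file" is always empty:

```shell
# Anything written to /dev/null simply vanishes.
echo 'discard me' > /dev/null

# Reading it back gives immediate end-of-file: zero bytes.
wc -c < /dev/null
```

The second command prints 0 regardless of how much has ever been written to /dev/null, which is what makes it so handy for discarding unwanted output.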
2.2 Device Files
One of the smarter ideas the unix designers had was to make all files consist of a stream of characters. There are no structures in this stream: no record lengths or block lengths or buffer lengths, no nonsense about fixed length record files and variable length record files. There is just a stream of characters. One of these characters is the newline character (ASCII NL), which is used as a record separator by text applications, but unix itself does not care about that (though a printing utility certainly will). Those of you who have never had to struggle with data descriptors and such things on IBM mainframes or VAXes will probably never realise just how good an idea this was. Having made this simplification, they then took the next step and realised that devices such as disks, tapes, terminals and so on can also be represented as streams of characters, at least as far as users are concerned. Inside the kernel, of course, there are device drivers which have to understand about SCSI commands and magnetic tape blocks and so on, but this is kept hidden from the user so that, for example:

echo 'Hello' > output.txt

writes "Hello" into a file called output.txt, and

echo 'Hello' > /dev/tty1

writes "Hello" onto the first terminal. A moment's thought should convince you that what actually happens behind the scenes is quite different in these two cases. /dev/tty1 is actually an interface between user-land and the terminal device driver, which is part of the kernel. This may seem like fingernail-painting, but it is very useful if you want to write software which is independent of the details of your hardware. In fact, all user-mode software is hardware-independent by default, and you have to go to extraordinary lengths to make a program dependent on a particular hardware device.
If you look under the /dev directory you will find "device files" which represent all the various devices attached to your machine. It is mostly pretty obvious which device files are an interface to which hardware devices, but I probably need to say a few words about the Solaris (SPARC) disk naming convention, which is very general (it was designed for very large systems with hundreds, even thousands, of disks) but appears unnecessarily convoluted for a small workstation. The device files which represent the disks on a Solaris machine can be found in two directories: /dev/dsk and /dev/rdsk. The names of the files in these two directories are the same and are something like c0t1d0s2. Here, c0 means controller 0 (the first controller), t1 means SCSI target 1, and d0 means LUN 0 or device 0. LUNs are not commonly used except in some RAID systems and tape libraries. s2 means slice (partition) 2. There are up to 8 partitions on a Solaris disk (s0 - s7). They are defined using the format program. By convention, s2 is always a pseudo partition covering the whole disk - do not remove or change this, ever.
Linux uses a simpler system. Disks are either /dev/hdx or /dev/sdx depending on whether the kernel sees them as IDE or SCSI targets. Serial ATA (SATA) disks are treated like SCSI as they employ the SCSI instruction set; the same is true for USB attached storage. Soon parallel ATA (PATA), also known as IDE, will be a deprecated format. The first disk to be found by the kernel on boot will be labelled "a", the second "b" and so on. The important thing to remember here is that they are labelled on boot. The label bears little relation to the SCSI target or to whether the disk is master or slave on the ATA bus. The first SCSI-type storage device to be found by the kernel will be /dev/sda and the first PATA device will be labelled /dev/hda. Therefore /dev/sda1 signifies partition 1 on the first disk. As the label bears little relation to the target numbering, we must be careful, as the following example illustrates.
Consider that you have a server with two disks set up in a mirror configuration for resilience. These disks are SCSI, so the two disks are known to the kernel as /dev/sda and /dev/sdb. Suppose that the first disk develops a fault. It is reasonable to expect that you might remove the first disk and run with a degraded mirror on just one disk. However, when the machine comes up with only one disk, that disk will be /dev/sda no matter which disk failed. If you then "hot add" a new disk into the running system, this disk will become /dev/sdb. This in itself is nothing to worry about, but when you are diagnosing problems with disks, remember that /dev/sda is not necessarily the lowest SCSI target. In the case above, it may be that /dev/sda is a higher SCSI target than /dev/sdb (normally the disks are "discovered" by the kernel during boot in the order that they exist on the bus). If you are unlucky enough to suffer a second disk problem on this host (with no reboot in between), you must remember to check the correspondence between disk device and physical SCSI target before pulling disks out. This is most easily done by examining the boot log.
Returning to Solaris on SPARC, it was mentioned above that there are two device files for each partition: a "raw" device and a "block" device, found in /dev/rdsk and /dev/dsk respectively. Always use the block device when mounting partitions; in most other situations you will use the raw devices.
2.3 Links
In a strict mathematical sense, the unix file structure is not a simple, hierarchical tree because of links. There are two kinds of links in the unix file store.
2.3.1 Hard links

To create a hard link, use the command ln patha pathb. This defines pathb as an alternative name for patha. patha and pathb do not need to be in the same directory, but they do have to be on the same filesystem. The two names are completely equivalent: if you edit either patha or pathb, the changes will be seen through both names. If you remove one name, the file remains and can be accessed using the other name. You cannot create hard links to directories (this is not allowed due to the danger of loops occurring when traversing the file store recursively).
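The behaviour described above can be demonstrated in a scratch directory (the paths here are chosen for illustration only):

```shell
# Work in a throwaway directory
mkdir -p /tmp/hardlink-demo
cd /tmp/hardlink-demo

echo 'original contents' > patha   # create a file
ln patha pathb                     # hard link: a second name for the same file

ls -l patha                        # the link count column now shows 2

echo 'more text' >> pathb          # edit via one name...
cat patha                          # ...and the change is visible via the other

rm patha                           # remove one name...
cat pathb                          # ...the file itself survives under the other
```

After the final rm, pathb's link count drops back to 1, and the data is only freed once the last name (and any open file handles) are gone.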
2.3.2 Symbolic or soft links

To create a symbolic link, use the command ln -s patha pathb. Again pathb is an alternative name for patha, but the two names are not equivalent: pathb is a pointer to patha. If patha is removed, pathb remains but points to nothing (a dangling link). You can create symbolic links to directories as well as files. Symbolic links can point to another file system or even another computer. Because of this, they are the great kludge in unix file stores. The example quoted earlier on Solaris (/bin pointing to /usr/bin) is typical. Almost always, when someone uses a symbolic link, it is a "quick fix". Symbolic links can be nested, though there is a limit on the depth of nesting, again to catch loops.
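The contrast with hard links, including the dangling-link case, looks like this in practice (paths are again purely illustrative):

```shell
mkdir -p /tmp/symlink-demo
cd /tmp/symlink-demo

echo 'target data' > patha
ln -s patha pathb          # pathb is a pointer to patha, not an equal name

cat pathb                  # following the link prints "target data"
ls -l pathb                # the listing shows "pathb -> patha"

rm patha                   # remove the target...
ls -l pathb                # ...the link itself remains...
cat pathb 2>/dev/null || echo 'dangling link'   # ...but it points to nothing
```

Note the asymmetry: removing pathb would leave patha untouched, but removing patha leaves pathb dangling.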
2.4 Mounting filesystems
When talking about hard links in section 2.3.1, we found that these can only exist within a single filesystem. A filesystem (in UNIX terms) is a single partition. This definition becomes slightly more fuzzy when we use more advanced volume management tools, but the basic premise still holds. Each filesystem resides on a single (sometimes arbitrary) slice of storage medium. On Windows (and some other operating systems) it is possible to refer to each filesystem by name (Windows uses a drive letter). However, on UNIX-like operating systems, we use the mount command to link all the filesystems together so that they look like a unified homogeneous tree containing the entire file store. As an example, if we have our root partition on /dev/sda1 and our var partition on /dev/sda2, we would then:

mount /dev/sda2 /var
so that everything in the var partition now exists in the /var directory. A filesystem can be mounted at different times in different places. This concept is used when repairing a system that cannot be booted: the host is booted from a rescue image, and the filesystems on the local disks can then be mounted on arbitrary mount points and repaired.
The next step is obvious: extend this concept to disks attached to other computers. This is what NFS (Network File System) is all about.

mount livcomms:/local1/data /data

Note the machine name followed by a colon; this syntax is very common in NFS commands. You can mount, on your machine, file systems which reside on a disk attached to another machine - the NFS server. Your machine is the NFS client. I have introduced two important words here. Also, a single machine can be an NFS server and an NFS client (for different file systems) at the same time.
2.4.1 Mounting on boot

Obviously, when a machine starts up, it has a table of mounts (both local disk mounts and NFS mounts) to be performed as part of the boot process, before users are allowed to log in. These are listed in a file called /etc/fstab (/etc/vfstab on Solaris). The exact detail of the layout of this file varies a bit, so RTFM! Note also that this file is used to define other disk space parameters such as the size and location of swap space. Here is an example of a Linux /etc/fstab:

# /etc/fstab: static file system information.
#
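The entries themselves vary from machine to machine; a minimal illustrative layout for a single-disk Linux host might look like the following (the device names, partition layout and mount options here are assumptions for the sake of example, not taken from a real NERC host):

```
# <file system>        <mount point>  <type>  <options>      <dump>  <pass>
/dev/sda1              /              ext3    defaults       0       1
/dev/sda2              /boot          ext3    defaults       0       2
/dev/sda3              /var           ext3    defaults       0       2
/dev/sda4              none           swap    sw             0       0
proc                   /proc          proc    defaults       0       0
livcomms:/local1/data  /data          nfs     rw,hard,intr   0       0
```

The last line shows an NFS mount expressed in the same host:path syntax introduced above; see fstab(5) for the meaning of the dump and pass fields.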
As you may be suspecting, the warning about the way that Linux names its disks in section 2.2 can come back to haunt us here. This is particularly noticeable on large file servers with many SCSI disks attached: adding another disk can throw out all the device names, and /etc/fstab will have to be rejigged. To get around this problem, it is possible to "label" each partition, and this is the default method on the Redhat distributions. The label is part of the filesystem and it is a simple matter to add a label to an existing partition:

/sbin/tune2fs -L root /dev/sda1
/sbin/tune2fs -L boot /dev/sda2
/sbin/tune2fs -L var /dev/sda3

or when creating the filesystem:

/sbin/mkfs -t ext3 -L scratch /dev/hda7

and then your fstab may contain lines like this:

LABEL=root /     ext3 defaults 0 1
LABEL=boot /boot ext3 defaults 0 2
LABEL=var  /var  ext3 defaults 0 2

Similar utilities exist for labelling other filesystems, such as XFS. Note that it is not possible to label the swap partition. Managing disks and partitions can be made a lot easier by using Logical Volume Management. There are a number of free and commercial systems available. The commonest one on Linux is LVM2, which is available when installing a number of Linux distros, including Redhat and Debian. I will not go into detail here; there is plenty of information available on the web.
2.5 File Sharing
File sharing on Linux and UNIX is usually done using NFS. This is a venerable application, compatible with just about any version of UNIX, Linux or BSD. Other applications for file sharing exist, but this is by far the most widely used. To share a file system from a Solaris host you would use the share(1M) command and /etc/dfs/dfstab. On Linux and most other systems, you would use the exportfs(8) command and the /etc/exports file. The /etc/exports file on a typical Linux fileserver might look like:

# exports file for Bush file server
# NERC master and Bush domain exports below
/local/master 192.171.136.0/24(ro,sync)
/local/bush/domain 192.171.136.0/24(ro,sync) bufiles(rw,sync)
# experimental (for now) switch to nerc-bush.ac.uk domain
/local/nerc-bush.ac.uk/domain 192.171.136.0/24(ro,sync) bufiles(rw,sync)
# application packages shares below
/local/packages 192.171.136.0/24(ro,sync) budbase(rw,sync)
# user shares below
/local/users 192.171.136.0/24(rw,sync)
/model/users 192.171.136.0/24(rw,sync)
/poll/users 192.171.136.0/24(rw,sync)
/ifetrop/users 192.171.136.0/24(rw,sync)
# data areas below
/model/data 192.171.136.0/24(rw,sync)
/poll/data 192.171.136.0/24(rw,sync)
/ifetrop/data 192.171.136.0/24(rw,sync)
#
/var/spool/mail 192.171.136.0/24(rw,sync)
/cache/condor_ckpt 192.171.136.0/24(rw,sync)

Make changes to this file and then export everything in it with /usr/sbin/exportfs -a. Solaris is similar; here is an example of /etc/dfs/dfstab. Note that long lines may be split using the "\" character.

share -F nfs -o [email protected] -d "mail spool" /var/mail
share -F nfs -o [email protected],rw=livcomms:livhome \
    -d "/nerc structure" /local/nerc-liv.ac.uk
share -F nfs -o [email protected],rw=livcomms -d "/nerc structure" \
    /local/master
#
share -F nfs -o [email protected] -d "Solaris packages" \
    /local1/packages
share -F nfs -o [email protected] /local1/data
#
share -F nfs -o [email protected] -d "Home dirs" /local/users

Exporting all shares listed in this file can be done by using the shareall(1M) command. There are one or two things that need to be known before plunging into an implementation of NFS.

1. NFS does not do file locking. File locking is provided by an additional piece of software. Normally this is not a problem, but for certain file access patterns it can be a real performance killer.

2. NFS does not export any metadata associated with a file. Modern filesystems allow extra file attributes to be stored alongside the file itself, mostly used for access control list (ACL) functionality. Such information is not visible on files exported by NFS.

These comments relate to NFSv3. NFSv4 provides more functionality and security but is still not widely deployed.
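On the subject of point 1 above: on a local Linux filesystem you can experiment with advisory locking using the flock(1) utility from util-linux. This is only a local sketch; over NFSv3 the equivalent locks are handled by the separate lockd/statd daemons rather than by the NFS protocol itself, which is exactly why certain access patterns suffer. The lock file path here is arbitrary:

```shell
# Take an advisory lock on a lock file and run a command while holding it.
# flock creates the lock file if it does not already exist.
lockfile=/tmp/nfs-lock-demo.lock
flock "$lockfile" -c 'echo "holding the lock"'

# A second, non-blocking attempt made while another process held the lock
# would fail immediately rather than wait:
#   flock -n "$lockfile" -c '...' || echo 'lock busy'
```

Advisory means cooperating processes must all choose to take the lock; nothing stops an ill-behaved process from ignoring it and writing anyway.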
2.6 Comparison with Windows filesystems
This is a bit of a minefield: a unix file system looks very like a Windows file system (especially NTFS), but there are all sorts of differences once you get into it. There are the obvious differences which everyone knows about:

- case sensitivity
- line terminators
- path component separator
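The first two of these are easy to demonstrate on any Linux box (the file names below are arbitrary examples):

```shell
mkdir -p /tmp/case-demo
cd /tmp/case-demo

# Case sensitivity: README and readme are two distinct files on unix,
# whereas Windows filesystems would treat them as the same name.
echo 'upper' > README
echo 'lower' > readme
ls                      # both names are listed

# Line terminators: unix text lines end with LF only; Windows uses CR LF.
printf 'one line\n' > unixfile
od -c unixfile          # shows \n with no preceding \r
```

The third difference is purely notational: unix paths use "/" (as in /usr/local/bin) where Windows uses "\".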