Section 14: The UNIX File System
Most UNIX machines store their files on
magnetic disk drives. A disk drive is a device that can store information by
magnetizing small regions of a magnetic surface. One or more heads skim close
to the spinning magnetic plate, and can detect, or change, the magnetic state of
a given spot on the disk. The drives use disk controllers to position the head
at the correct place at the correct time to read from, or write to, the magnetic
surface of the plate. It is often possible to partition a single disk drive into
more than one logical storage area. This section describes how the UNIX
operating system deals with a raw storage device like a disk drive, and how it
manages to make organized use of the space.
How the UNIX file system works
Every item in a UNIX file system can be
defined as belonging to one of four possible types:
- Ordinary files
- Ordinary files can contain text, data, or program information. An ordinary
file cannot contain another file, or directory. An ordinary file can be
thought of as a one-dimensional array of bytes.
- Directories
- In a previous section, we described directories as containers that can
hold files, and other directories. A directory is actually implemented as a
file that has one entry for each item contained within the directory. Each entry
in a directory file contains only the name of the item, and a numerical
reference to the location of the item. The reference is called an
i-number, and is an index into a table known as the i-list. The
i-list records, for every item in the file system, where that item is
stored.
- Special files
- Special files represent input/output (i/o) devices, like a tty (terminal),
a disk drive, or a printer. Because UNIX treats such devices as files, a
degree of compatibility can be achieved between device i/o, and ordinary file
i/o, so that the same programs can often work with either. Special files can be
either character special files, which deal with streams of characters,
or block special files, which operate on larger blocks of data. Typical
block sizes are 512 bytes, 1024 bytes, and 2048 bytes.
- Links
- A link is a pointer to another file. Remember that a directory is nothing
more than a list of the names and i-numbers of files. A directory entry can be
a hard link, in which the i-number points directly to another file. A
hard link to a file is indistinguishable from the file itself. When a hard
link is made, the i-numbers of two different directory file entries point
to the same i-node. Because an i-number is meaningful only within its own
i-list, hard links cannot span across file
systems. A soft link (or symbolic link) provides an indirect
pointer to a file. A soft link is implemented as a directory file entry
containing a pathname. Soft links are distinguishable from files, and can span
across file systems. Not all versions of UNIX support soft links.
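The four types, and the difference between hard and soft links, can be observed from a program. The following is a minimal sketch in Python, assuming a system that supports both kinds of link; the function names `classify` and `link_demo` and the scratch file names are made up for illustration. It creates an ordinary file, a hard link, and a soft link in a temporary directory, then reports each item's type, i-number, and link count.

```python
import os
import stat
import tempfile


def classify(path):
    """Name the type of `path` without following a final soft link."""
    mode = os.lstat(path).st_mode
    if stat.S_ISLNK(mode):
        return "soft link"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISCHR(mode):
        return "character special file"
    if stat.S_ISBLK(mode):
        return "block special file"
    if stat.S_ISREG(mode):
        return "ordinary file"
    return "other"


def link_demo():
    """Create a file, a hard link, and a soft link; report what each one is."""
    report = {}
    with tempfile.TemporaryDirectory() as scratch:
        original = os.path.join(scratch, "original.txt")  # hypothetical names
        hard = os.path.join(scratch, "hard.txt")
        soft = os.path.join(scratch, "soft.txt")
        with open(original, "w") as handle:
            handle.write("hello\n")
        os.link(original, hard)     # second directory entry, same i-node
        os.symlink(original, soft)  # new entry that stores a pathname
        for name, path in (("original", original), ("hard", hard), ("soft", soft)):
            info = os.lstat(path)
            # (type, i-number, number of hard links to the i-node)
            report[name] = (classify(path), info.st_ino, info.st_nlink)
        report["soft target"] = os.readlink(soft)
        report["scratch"] = classify(scratch)
    return report


if __name__ == "__main__":
    for name, details in link_demo().items():
        print(name, details)
```

The hard link reports the same i-number as the original and is classified as an ordinary file, while the soft link has its own i-number and can be told apart from the file it points to.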
The I-List
When we speak of a UNIX file system, we are actually
referring to an area of physical storage represented by a single i-list. A UNIX
machine may be connected to several file systems, each with its own i-list. One
of those i-lists points to a special storage area, known as the root file
system. The root file system contains the files for the operating system
itself, and must be available at all times. Other file systems are removable.
Removable file systems can be attached, or mounted, to the root file
system. Typically, an empty directory is created on the root file system as a
mount point, and a removable file system is attached there. When you issue a cd
command to access the files and directories of a mounted removable file system,
your file operations will be controlled through the i-list of the removable file
system.
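One way to see where one file system ends and another begins is to compare device numbers, which the operating system reports for every file. The sketch below is only an illustration; the helper names `file_system_id`, `crosses_mount`, and `nearest_mount_point` are hypothetical, and it assumes a POSIX system where the root directory is always a mount point.

```python
import os


def file_system_id(path):
    """Return the device number identifying the file system that holds `path`."""
    return os.stat(path).st_dev


def crosses_mount(parent, child):
    """Return True if `child` lives on a different file system than `parent`."""
    return file_system_id(parent) != file_system_id(child)


def nearest_mount_point(path):
    """Walk upward from `path` until a mount point is reached."""
    current = os.path.realpath(path)
    # The loop ends at the root directory at the latest.
    while not os.path.ismount(current):
        current = os.path.dirname(current)
    return current


if __name__ == "__main__":
    here = os.getcwd()
    print("current directory is mounted under", nearest_mount_point(here))
    print("different file system from /:", crosses_mount("/", here))
```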
The purpose of the i-list is to provide the operating system with a map into
the storage space of some physical device. The map is continually being
revised, as the files are created and removed, and as they shrink and grow in
size. Thus, the mechanism of mapping must be very flexible to accommodate drastic
changes in the number and size of files. The i-list is stored in a known
location, on the same storage device that it maps.
Each entry in an i-list is called an i-node. An i-node is a complex
structure that provides the necessary flexibility to track the changing file
system. Each i-node holds what the operating system needs to locate a file's data on
the storage device, which typically communicates in fixed-size disk
blocks. An i-node contains 10 direct pointers, which point to disk blocks on
the storage device. In addition, each i-node also contains one indirect
pointer, one double indirect pointer, and one triple indirect
pointer. The indirect pointer points to a block of direct pointers. The
double indirect pointer points to a block of indirect pointers, and the triple
indirect pointer points to a block of double indirect pointers. By structuring
the pointers in a geometric fashion, a single i-node can represent a very large
file.
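To see how quickly the pointer scheme grows, the arithmetic can be worked out for a given block size. The sketch below assumes that each pointer occupies 4 bytes, which is an assumption made for illustration and not something fixed by the description above; the function names are also hypothetical.

```python
def max_file_blocks(block_size, pointer_size=4, direct=10):
    """Count the data blocks one i-node can reach.

    pointer_size=4 is an assumption; real file systems differ.
    """
    per_block = block_size // pointer_size  # pointers that fit in one block
    single = per_block                      # via the indirect pointer
    double = per_block ** 2                 # via the double indirect pointer
    triple = per_block ** 3                 # via the triple indirect pointer
    return direct + single + double + triple


def max_file_bytes(block_size, pointer_size=4, direct=10):
    """Largest file size, in bytes, that one i-node can describe."""
    return max_file_blocks(block_size, pointer_size, direct) * block_size


if __name__ == "__main__":
    for size in (512, 1024, 2048):
        print(size, "byte blocks:", max_file_bytes(size), "bytes")
```

Under these assumptions, 1024-byte blocks hold 256 pointers each, so one i-node reaches 10 + 256 + 65,536 + 16,777,216 = 16,843,018 blocks, or roughly 16 gigabytes.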
It now makes a little more sense to view a UNIX directory as a list of
names paired with i-numbers, each i-number referencing a specific i-node on a
specific i-list. The operating system traces its way through a file path one
directory at a time, looking up each name to find the next i-node,
until it reaches the direct pointers that contain the actual location of the
file on the storage device.
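The view of a directory as a list of names and i-numbers can be checked directly. This sketch, with hypothetical function names, lists the entries of a directory along with their i-numbers, and then shows the i-number found at each step of a path lookup. It only imitates what the operating system does internally.

```python
import os


def directory_entries(path):
    """Return (name, i-number) pairs for the entries in `path`, sorted by name."""
    with os.scandir(path) as entries:
        return sorted((entry.name, entry.inode()) for entry in entries)


def trace_path(path):
    """Return (pathname, i-number) for each directory visited on the way to `path`."""
    absolute = os.path.abspath(path)
    current = os.sep
    steps = [(current, os.stat(current).st_ino)]  # start at the root directory
    for part in absolute.split(os.sep):
        if not part:
            continue
        current = os.path.join(current, part)
        steps.append((current, os.stat(current).st_ino))
    return steps


if __name__ == "__main__":
    for name, inumber in directory_entries("."):
        print(inumber, name)
    for pathname, inumber in trace_path("."):
        print(inumber, pathname)
```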
The file system table
Each file system that is mounted on a UNIX machine
is accessed through its own block special file. Information about each of these
block special files is kept in a system database called the file system table,
which is usually located in /etc/fstab. It includes the name of
the device, the directory name under which it will be mounted, and the read and
write privileges for the device. It is possible to mount a file system as
"read-only," to prevent users from changing anything.
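The layout of the file system table varies between implementations, so the following is only a sketch. It assumes a common layout in which each line holds a device name, a mount directory, a file system type, and a comma-separated list of options, with "ro" marking a read-only mount. The sample text, the device names in it, and the names `FstabEntry` and `parse_fstab` are all invented for illustration.

```python
from typing import NamedTuple

# Hypothetical sample; the real contents of /etc/fstab differ between systems.
SAMPLE = """\
# device          directory  type  options
/dev/dsk/c0d0s0   /          hfs   rw       0 1
/dev/dsk/c0d0s3   /users     hfs   ro       0 2
"""


class FstabEntry(NamedTuple):
    device: str
    mount_point: str
    fs_type: str
    options: list

    @property
    def read_only(self):
        """True if the entry asks for a read-only mount."""
        return "ro" in self.options


def parse_fstab(text):
    """Parse file system table text into a list of FstabEntry records."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        fields = line.split()
        if len(fields) < 4:
            continue
        device, mount_point, fs_type, options = fields[:4]
        entries.append(FstabEntry(device, mount_point, fs_type, options.split(",")))
    return entries


if __name__ == "__main__":
    for entry in parse_fstab(SAMPLE):
        mode = "read-only" if entry.read_only else "read-write"
        print(entry.device, "on", entry.mount_point, mode)
```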
File system quotas
Although not originally part of the UNIX filesystem,
quotas quickly became a widely used tool. Quotas allow the system administrator
to limit the disk resources that users can consume. Quotas usually
place restrictions on the amount of space, and the number of files, that a user
can take. The limit can be a soft limit, where only a warning is
generated, or a hard limit, where no further operations that create files
will be allowed.
The command
- quota
will let you know if you're over your soft limit.
Adding the -v option will provide statistics about your disk usage.
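The difference between the two kinds of limit can be expressed as a small decision rule. This is a simplified sketch of the idea and not how any particular quota system is implemented; the function name `check_quota` and its parameters are hypothetical, and details such as grace periods are left out.

```python
def check_quota(current_blocks, requested_blocks, soft_limit, hard_limit):
    """Decide whether an allocation may proceed.

    Returns a (status, resulting_usage) pair, where status is
    "ok", "warning" (over the soft limit), or "denied" (over the hard limit).
    """
    total = current_blocks + requested_blocks
    if total > hard_limit:
        # Hard limit: the operation is refused and usage does not change.
        return ("denied", current_blocks)
    if total > soft_limit:
        # Soft limit: the operation succeeds, but the user is warned.
        return ("warning", total)
    return ("ok", total)


if __name__ == "__main__":
    print(check_quota(900, 50, soft_limit=1000, hard_limit=1200))
    print(check_quota(900, 200, soft_limit=1000, hard_limit=1200))
    print(check_quota(900, 400, soft_limit=1000, hard_limit=1200))
```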
File system related commands
Here are some commands related to file
system usage, and other topics discussed in this section:
- bdf
- On HP-UX systems, reports file system usage statistics
- df
- On HP-UX systems, reports on free disk blocks, and i-nodes
- du
- Summarizes disk usage in a specified directory hierarchy
- ln
- Creates a hard link (default), or a soft link (with -s option)
- mount, umount
- Attaches, or detaches, a file system (super user only)
- mkfs
- Constructs a new file system (super user only)
- fsck
- Evaluates the integrity of a file system (super user only)
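The kind of information that df reports is also available to programs. The sketch below uses Python's `os.statvfs` call to collect block and i-node counts for the file system holding a given path. The function name `free_space_report` and the dictionary keys are chosen for illustration, and some file systems report zero for the i-node fields.

```python
import os


def free_space_report(path="/"):
    """Collect block and i-node statistics for the file system holding `path`."""
    info = os.statvfs(path)
    return {
        "block_size": info.f_frsize,         # size of one block, in bytes
        "total_blocks": info.f_blocks,
        "free_blocks": info.f_bfree,         # free, including reserved blocks
        "available_blocks": info.f_bavail,   # free for ordinary users
        "total_inodes": info.f_files,
        "free_inodes": info.f_ffree,
    }


if __name__ == "__main__":
    for key, value in free_space_report("/").items():
        print(key, value)
```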
A brief tour of the UNIX filesystem
The actual locations and names of
certain system configuration files will differ under different implementations
of UNIX. Here are some examples of important files and directories under version
9 of the HP-UX operating system:
- /hp-ux
- The kernel program
- /dev/
- Where special files are kept
- /bin/
- Executable system utilities, like sh, cp, rm
- /etc/
- System configuration files and databases
- /lib/
- Operating system and programming libraries
- /tmp/
- System scratch files (all users can write here)
- /lost+found/
- Where the file system checker puts detached files
- /usr/bin/
- Additional user commands
- /usr/include/
- Standard system header files
- /usr/lib/
- More programming and system call libraries
- /usr/local/
- Typically a place where local utilities go
- /usr/man/
- The manual pages are kept here
Other places to look for useful stuff
If you get an account on an
unfamiliar UNIX system, take a tour of the directories listed above, and
familiarize yourself with their contents. Another way to find out what is
available is to look at the contents of your PATH environment variable:
- echo $PATH
You can use the ls command to list the contents
of each directory in your path, and the man command to get help on unfamiliar
utilities. A good systems administrator will ensure that manual pages are
provided for the utilities installed on the system.
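The tour of your path can also be automated. The following sketch splits the PATH variable into its directories and lists the executable files found in each one. The function names are hypothetical, and directories that are missing or unreadable are simply reported as empty.

```python
import os


def path_directories(path_value=None):
    """Split a PATH-style string into its directories, skipping empty parts."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    return [part for part in path_value.split(os.pathsep) if part]


def list_commands(directory):
    """Return the sorted names of executable files in `directory`."""
    try:
        names = os.listdir(directory)
    except OSError:
        return []  # missing or unreadable directory
    commands = []
    for name in names:
        full = os.path.join(directory, name)
        if os.path.isfile(full) and os.access(full, os.X_OK):
            commands.append(name)
    return sorted(commands)


if __name__ == "__main__":
    for directory in path_directories():
        print(directory, len(list_commands(directory)), "commands")
```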