Systems Administration Chapter 14: Kernel
Page 333
Chapter
Kernel
The heart that keeps the system pumping
The kernel is the core of the operating system. It is the program that controls the basic
services that are utilised by user programs. It is this suite of basic services, in the form
of system calls, that makes an operating system "UNIX".
The kernel is also responsible for:
· CPU resource scheduling (with the associated duties of process management)
· Memory management (including the important implementation of protection)
· Device control (including providing the device-file/device-driver interface)
· Security (at a device, process and user level)
· Accounting services (including CPU usage and disk quotas)
· Inter Process Communication (shared memory, semaphores and message passing)
The Linux Kernel FAQ sums it up nicely with:
The Unix kernel acts as a mediator for your programs. First, it does the memory
management for all of the running programs (processes), and makes sure that they all
get a fair (or unfair, if you please) share of the processor's cycles. In addition, it
provides a nice, fairly portable interface for programs to talk to your hardware.
Obviously, there is more to the kernel's operation than this, but the basic functions
above are the most important to know.
Other resources
Other resources which discuss kernel related matters include:
· HOW-TOs
Kernel HOWTO, Kerneld mini-HOWTO, LILO mini-HOWTO, GRUB mini-HOWTO and
Modules mini-HOWTO.
The web site lists these
HOWTOs and many more. This is a critical resource to any Systems
Administrator.
· Linux Kernel 2.4 Internals
This manual, available from the Linux Documentation Project (or alternatively on
the Systems Administration CD-ROM), describes the principles and mechanisms
used by the Linux Kernel.
· Linux Kernel Module Programming Guide
This book, available from the LDP, describes how to write kernel modules. It can
be found at: along with many other guides.
· Linux Device Drivers
This O'Reilly book describes how to write device drivers for Linux.
· Linux Administration Made Easy (LAME)
A book from the LDP which includes sections on Linux Kernel Upgrades,
Upgrading a Red Hat Stock Kernel, Building a Custom Kernel, and Moving from
one Linux Kernel Version to another. LAME can be found at:
· The Red Hat <VERSION_NUMBER> Reference Guide
Includes a number of sections describing the process for configuring and
compiling kernels.
· The Linux Kernel Archives
This is the primary site for the Linux kernel source.
· The International Kernel Patch
Where the Linux kernel, with fully-fledged cryptographic support, is distributed
(sites in the US can't legally distribute it).
Why the kernel?
Why study the kernel? Isn't that an operating-system-type-thing? What does a
Systems Administrator have to do with the internal mechanics of the OS?
The answer to the above questions is: Lots!
As you are aware, the Linux kernel is based on UNIX. UNIX systems usually
provide the source code for their kernels, making them open systems (there are
exceptions to this in the commercial UNIX world). Having direct access to the source
code allows the Systems Administrator to directly customise the kernel for their
particular system/s. A Systems Administrator might do this for any number of reasons
including:
· Hardware changes
They have modified the system hardware (adding devices, memory, processors
etc.)
· Memory optimisation
They wish to optimise the memory usage (called reducing the kernel footprint)
· Improving speed and performance
The speed and performance of the system may need improvement (for example,
modify the quantum per task to suit CPU intensive vs I/O intensive systems).
This process (along with optimising memory) is known as tweaking.
· Kernel upgrades
Improvements to the kernel can be provided in the form of source code. This
allows the Systems Administrator to easily upgrade the system with a kernel
recompile.
Recompiling the kernel is the process whereby the kernel is reconfigured. The source
code is regenerated/recompiled and a linked object is produced. Throughout this
chapter, the concept of recompiling the kernel will mean both the kernel source code
compilation and linkage.
How?
In this chapter, we will be going through the step-by-step process of compiling a
kernel, a process that includes:
· Finding out about your current kernel (what version it is and where it is located?)
· Obtaining the kernel (where do you get the kernel source, how do you unpack it
and where do you put it?)
· Obtaining and reading documentation (where can I find out about my new kernel
source?)
· Configuring your kernel (how is this done, what is this doing?)
· Compiling your kernel (how do we do this?)
· Testing the kernel (why do we do this and how?)
· Installing the kernel (how do we do this?)
But to begin with, we really need to look at exactly what the kernel is, at the physical
level, and how it is generated.
To do this, we will examine the Linux kernel, specifically on the x86 architecture.
The lifeless image
The kernel is physically a file that is usually located in the /boot directory. Under
Linux, this file is called vmlinuz. On my system, an ls listing of the kernel produced:
[root@linuxbox root]# ls -al /boot/vml*
-rwxr-xr-x 1 root root 3063962 Sep 5 02:27 /boot/vmlinux-2.4.18-14
lrwxrwxrwx 1 root root 17 Dec 25 00:17 /boot/vmlinuz -> vmlinuz-2.4.18-14
-rw-r--r-- 1 root root 1085191 Sep 5 02:27 /boot/vmlinuz-2.4.18-14
You can see in this instance that the “kernel file” is actually a link to another file
containing the kernel image.
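The same check can be scripted. The following is a minimal sketch, assuming the common readlink utility is available on your system; the function name image_target is made up for this illustration and is not a standard command. It prints the file that a kernel image link points to, or the path itself when the path is not a link.

```shell
# image_target is a hypothetical helper name, not a standard command.
# Print what a kernel image path points to (one level of link only).
image_target() {
    if [ -L "$1" ]; then
        readlink "$1"          # e.g. vmlinuz -> vmlinuz-2.4.18-14
    else
        printf '%s\n' "$1"     # a regular file is its own target
    fi
}

# Usage (assumes the /boot/vmlinuz link shown in the listing exists):
#   image_target /boot/vmlinuz
```

Because the link target is usually a relative name, it should be read as relative to the directory that holds the link (here /boot).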
The actual kernel size will vary from machine to machine. The reason for this is that
the size of the kernel is dependent on what features you have compiled into it, what
modifications you've made to the kernel data structures and what (if any) additions
you have made to the kernel code.
vmlinuz is referred to as the kernel image. At a physical level, this file consists of a
small section of machine code followed by a compressed block. At boot time, the
program at the start of the kernel is loaded into memory, at which point the rest of the
kernel is uncompressed.
This is an ingenious way of making the physical kernel image on disk as small as
possible. In the listing above, the compressed vmlinuz is about one megabyte in size,
while the uncompressed vmlinux beside it is about three megabytes.
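To see this layout for yourself, one rough approach is to search the image for the three-byte gzip signature (1f 8b 08). This sketch assumes the compressed block is gzip data, which is an assumption made here and not something the text above states; gzip_offset is a made-up helper name. Treat the result as a hint only, since the same three bytes can also occur by chance in the machine code.

```shell
# gzip_offset (hypothetical helper): print the byte offset of the first
# gzip signature (1f 8b 08) in a file, or return 1 if none is found.
gzip_offset() {
    # od dumps every byte as two hex digits; tr joins the dump into one
    # long line so that awk can search it as a single string.
    { od -An -v -tx1 "$1" | tr -s ' \n' ' '; echo; } | awk '{
        sub(/^ */, "", $0)               # drop any leading padding
        pos = index($0, "1f 8b 08")
        if (pos == 0) exit 1
        # Each byte now occupies three characters ("xx "), so convert
        # the character position back into a byte offset.
        print (pos - 1) / 3
    }'
}

# Usage (path taken from the listing above):
#   gzip_offset /boot/vmlinuz
```

If the assumption holds, everything before the reported offset is the small loader program and everything from the offset onwards is the compressed kernel.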
So what makes up this kernel?
Inside the great unknown, the kernel
An uncompressed kernel is really a giant object file; you should remember from past
subjects that an object file is the product of compiling and linking C and assembler
source. It is important to note that the kernel is not an "executable" file (i.e. you can't
just type vmlinuz at the prompt to run the kernel). The actual source of the kernel is
stored in the /usr/src/linux directory. A typical listing may produce:
[root@linuxbox root]# ls -al /usr/src
total 16
drwxr-xr-x 4 root root 4096 Dec 25 01:25 .
drwxr-xr-x 16 root root 4096 Dec 25 01:38 ..
lrwxrwxrwx 1 root root 15 Dec 25 01:25 linux-2.4 -> linux-2.4.18-14
drwxr-xr-x 17 root root 4096 Dec 25 01:24 linux-2.4.18-14
drwxr-xr-x 7 root root 4096 Dec 25 00:31 redhat
/usr/src/linux-2.4 is a soft link to /usr/src/<whatever linux 2.4.version>.
This means you can store several kernel source trees. However, you MUST change
the soft link of /usr/src/linux-2.4 to the version of the kernel you will be
compiling, as there are several components of the kernel source that rely on this.
SPECIAL NOTE: If your system doesn't have a /usr/src/linux-2.4 or a /usr/src/linux*
directory (where * is the version of the Linux source), or there is a /usr/src/linux-2.4
directory but it only contains a couple of files, then you don't have the source code
installed on your machine.
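The check described in this note can be sketched as a small shell function. The name source_installed and the particular files it tests for (a top-level Makefile plus the kernel and init directories) are assumptions chosen for this illustration; they are a heuristic, not an official test.

```shell
# source_installed (hypothetical helper): succeed if DIR looks like an
# unpacked kernel source tree rather than a missing or near-empty one.
# The entries tested are a heuristic chosen for this sketch.
source_installed() {
    [ -d "$1" ] && [ -f "$1/Makefile" ] && [ -d "$1/kernel" ] && [ -d "$1/init" ]
}

# Usage:
#   if source_installed /usr/src/linux-2.4; then
#       echo "kernel source found"
#   else
#       echo "kernel source missing: install it first"
#   fi
```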
The quick solution is to install the RPM file containing the kernel source code from
the Red Hat installation CD-ROM.
To install the source from the CD-ROM you need to follow these steps:
· Firstly, if you are not already at the command line, open up a terminal window.
· Place the first installation disk into your CD-ROM drive.
· Mount the CD-ROM using the mount command, for example:
mount /mnt/<CD-ROM>
· Use the following command to move to the CD-ROM directory:
cd /mnt/<CD-ROM>
· Copy the RPM file of your choice to the /usr/src directory using the following
command:
cp <FILE_NAME>.rpm /usr/src
· Use the cd command to move to the /usr/src directory.
· Use the following command to install the source RPM on your system:
rpm -Uvh <FILE_NAME>.rpm
· You should see a progress indicator for the installation. If no errors occur, the
source code will be installed in the directory /usr/src/linux-<VERSION NUMBER>.
As an alternative you might want to download a more recent version of the kernel
source from the Internet. You can download it from
Alternatively, the Kernel HOWTO describes another way of obtaining it:
You can obtain the source via anonymous ftp from ftp.kernel.org in
/pub/linux/kernel/vx.y, where x.y is the version (e.g. 2.4), and as mentioned
before, the ones that end with an odd number are development releases and may be
unstable. It is typically labelled linux-x.y.z.tar.gz, where x.y.z is the version
number. The sites also typically carry ones with a suffix of .bz2, which have been
compressed with bzip2 (these files will be smaller and take less time to transfer).
It's best to use ftp.xx.kernel.org where xx is your country code; examples being
ftp.at.kernel.org for Austria, and ftp.us.kernel.org for the United States.
Generally you will only want to obtain a "stable" kernel version. For the first time
Linux user, the number of versions that are available can be a little daunting. Which
version do I want to download? How can I be sure that it is not a developmental
version?
The answers are not as hard as they first seem. Linux kernel versions are almost
always denoted by three numbers separated by dots. These three numbers have a
specific format: major_version.minor_version.bugfix. As an example, 2.4.20 has the
major number of 2, the minor number of 4 and the bugfix of 20.
Additional rules associated with this format include:
· “Even” minor versions (0, 2, 4, etc) represent stable versions
· “Odd” minor versions (1, 3, 5, etc) represent developmental versions
This means that the version 2.4.20 is a stable version and 2.3.20 is developmental.
· The bugfixes are sequential
Version 2.4.20 is newer than version 2.4.10.
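These rules are simple enough to express as a short script. The sketch below assumes version strings made of plain numbers separated by dots, exactly as in the format just described; the helper names kernel_series and newer_bugfix are invented for this example.

```shell
# kernel_series (hypothetical helper): classify a version string of the
# form major.minor.bugfix using the even/odd minor rule described above.
kernel_series() {
    minor=$(printf '%s\n' "$1" | cut -d. -f2)
    case "$minor" in
        ''|*[!0-9]*) echo "unknown"; return 1 ;;   # not a plain number
    esac
    if [ $((minor % 2)) -eq 0 ]; then
        echo "stable"
    else
        echo "developmental"
    fi
}

# newer_bugfix (hypothetical helper): of two versions from the same
# major.minor series, print the one with the higher bugfix number.
newer_bugfix() {
    a=$(printf '%s\n' "$1" | cut -d. -f3)
    b=$(printf '%s\n' "$2" | cut -d. -f3)
    if [ "$a" -ge "$b" ]; then echo "$1"; else echo "$2"; fi
}

# Usage:
#   kernel_series 2.4.20
#   newer_bugfix 2.4.10 2.4.20
```

Version strings carrying extra suffixes are outside what this sketch handles.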
As of this writing (Mar, 2003), the stable kernel version is 2.4 and the developmental
version is 2.5. Much discussion and excitement currently abounds on many Internet
discussion lists about the significant performance increases provided by the
forthcoming 2.6 Linux kernel.
Okay, now that you are able to identify the version that you want to download, the
next step is to identify which format you should use. Basically there are two main
formats: Red Hat Package Manager (rpm) or Tarball (tar). The rpm file is the
easier format to install. Once you have downloaded the rpm for the version that you
want, you simply have to follow the instructions above on how to install the source,
from the 5th step onwards.
If you still want to try installing the source code from a tar file, it is not really that
difficult. To unpack the source, you need to change working directory to /usr/src
and run the command tar zxpvf linux-x.y.z.tar.gz (if you've just got a .tar file with no
.gz at the end, tar xpvf linux-x.y.z.tar will do the job). The contents of the source will
fly by. When finished, there will be a new linux-<VERSION-NUMBER> directory in the
/usr/src directory, and that’s it.
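The choice between these tar invocations can be wrapped in one function. This is a sketch only: unpack_source is a hypothetical name, the v (verbose) flag is left out so the file names do not fly by, and the .bz2 case assumes the bzip2 program is installed.

```shell
# unpack_source (hypothetical helper): unpack a kernel tarball into a
# destination directory, choosing the decompression step by suffix.
unpack_source() {
    tarball=$1
    dest=$2
    case "$tarball" in
        *.tar.gz)  gzip -dc "$tarball"  | (cd "$dest" && tar xpf -) ;;
        *.tar.bz2) bzip2 -dc "$tarball" | (cd "$dest" && tar xpf -) ;;
        *.tar)     (cd "$dest" && tar xpf -) < "$tarball" ;;
        *)         echo "unrecognised archive: $tarball" >&2; return 1 ;;
    esac
}

# Usage (the file name is a placeholder, as in the text):
#   unpack_source /tmp/linux-x.y.z.tar.gz /usr/src
```

Decompressing in a separate step and piping into tar avoids relying on any particular tar option for bzip2 archives.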
Red Hat puts their kernel in /usr/src/linux-2.4 rather than the standard
/usr/src/linux. You need to make sure that this soft link exists. The following
demonstrates:
lrwxrwxrwx 1 root root 15 Dec 25 01:25 linux-2.4 -> linux-2.4.18-14
drwxr-xr-x 17 root root 4096 Dec 25 01:24 linux-2.4.18-14
When you install other versions of the kernel source, you simply change where the
soft link points. NEVER just delete your old source - you may need it to recompile
your old kernel version if you find the new version isn't working out, though we will
discuss other ways around this problem in later sections.
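Changing where the soft link points, while leaving every source tree in place, might look like the following sketch. The function name switch_source_link is hypothetical, and the directory name linux-2.4.20 in the usage line is only an example of a second source tree.

```shell
# switch_source_link (hypothetical helper): repoint the linux-2.4 soft
# link inside SRC_DIR at another unpacked source tree, leaving every
# tree itself untouched.
switch_source_link() {
    src_dir=$1
    version_dir=$2
    if [ ! -d "$src_dir/$version_dir" ]; then
        echo "no such source tree: $src_dir/$version_dir" >&2
        return 1
    fi
    # Remove only the link (never a real directory), then recreate it.
    if [ -L "$src_dir/linux-2.4" ]; then
        rm "$src_dir/linux-2.4"
    elif [ -e "$src_dir/linux-2.4" ]; then
        echo "$src_dir/linux-2.4 exists and is not a soft link" >&2
        return 1
    fi
    ln -s "$version_dir" "$src_dir/linux-2.4"
}

# Usage (example directory name):
#   switch_source_link /usr/src linux-2.4.20
```

The function refuses to act when linux-2.4 is a real directory, so an old source tree cannot be removed by accident.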
Now that everyone should have the source code installed on their systems, we can
have a look at some of the basic files and directories and what they include and are
used for. A typical listing of /usr/src/linux-2.4 produces:
drwxr-xr-x 17 root root 4096 Dec 25 01:24 .
drwxr-xr-x 4 root root 4096 Dec 25 01:25 ..
drwxr-xr-x 11 root root 4096 Dec 25 01:23 abi
drwxr-xr-x 3 root root 4096 Dec 25 01:23 arch
drwxr-xr-x 2 root root 4096 Dec 25 01:23 configs
-rw-r--r-- 1 root root 18691 Sep 5 01:53 COPYING
-rw-r--r-- 1 root root 79886 Sep 5 01:54 CREDITS
drwxr-xr-x 4 root root 4096 Dec 25 01:23 crypto
drwxr-xr-x 31 root root 4096 Dec 25 01:23 Documentation
drwxr-xr-x 42 root root 4096 Dec 25 01:24 drivers
drwxr-xr-x 47 root root 4096 Dec 25 01:24 fs
drwxr-xr-x 11 root root 4096 Dec 25 01:24 include
drwxr-xr-x 2 root root 4096 Dec 25 01:24 init
drwxr-xr-x 2 root root 4096 Dec 25 01:24 ipc
drwxr-xr-x 2 root root 4096 Dec 25 01:24 kernel
drwxr-xr-x 4 root root 4096 Dec 25 01:24 lib
-rw-r--r-- 1 root root 42807 Sep 5 01:54 MAINTAINERS
-rw-r--r-- 1 root root 20686 Sep 5 02:15 Makefile
drwxr-xr-x 2 root root 4096 Dec 25 01:24 mm
drwxr-xr-x 31 root root 4096 Dec 25 01:24 net
-rw-r--r-- 1 root root 14239 Sep 5 01:53 README
-rw-r--r-- 1 root root 2815 Apr 7 2001 REPORTING-BUGS
-rw-r--r-- 1 root root 9209 Sep 5 01:53 Rules.make
drwxr-xr-x 4 root root 4096 Dec 25 01:24 scripts
Within this directory hierarchy there are in excess of 8000 files and directories. On
my system this consists of around 3280 C source code files, 2470 C header files, 40
Assembler source files and 260 Makefiles. These, when compiled, produce around
300 object files and libraries. At a rough estimate, this consumes around 16 MB of
space (this figure will vary).
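You can produce the equivalent counts for your own source tree with find. In this sketch the helper names count_named and count_sources are invented, and assembler files are assumed to be those ending in .S; your figures will differ from the ones quoted above.

```shell
# count_named (hypothetical helper): count the regular files beneath
# TREE whose name matches PATTERN.
count_named() {
    n=$(find "$1" -type f -name "$2" | wc -l)
    echo $((n))    # arithmetic expansion strips any padding from wc
}

# count_sources (hypothetical helper): summarise a source tree.
count_sources() {
    echo "C source files: $(count_named "$1" '*.c')"
    echo "C header files: $(count_named "$1" '*.h')"
    echo "Assembler files: $(count_named "$1" '*.S')"
    echo "Makefiles: $(count_named "$1" 'Makefile')"
}

# Usage:
#   count_sources /usr/src/linux-2.4
#   du -sk /usr/src/linux-2.4/     # disk space used, in kilobytes
```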
Take a few minutes to have a look at some of the code found within the files in each
of the directories. For example, the kernel directory contains all of the code associated
with the kernel itself. Use vi to have a look at the code, but be careful not to
accidentally modify the code, as this can cause problems when you try to compile at a
later stage. The various other directories form logical divisions of the code, especially
between the architecture dependent code (linux-2.4/arch), drivers (linux-2.4/drivers)
and architecture independent code. By using grep and find, it is possible to trace the
structure of the kernel program, look at the boot process and find out how various
parts of it work.
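As a starting point for this kind of tracing, find and grep can be combined as sketched below. The function name where_mentioned is made up for this example, and the -w (whole word) option is assumed to be supported by your grep, as it is in the common implementations.

```shell
# where_mentioned (hypothetical helper): list every line in the C,
# header and assembler files beneath TREE that mentions NAME as a
# whole word, in the form file:line:text.
where_mentioned() {
    tree=$1
    name=$2
    # /dev/null is passed as a second file so that grep always prints
    # the file name in front of each match.
    find "$tree" -type f \( -name '*.c' -o -name '*.h' -o -name '*.S' \) \
        -exec grep -n -w -- "$name" {} /dev/null \;
}

# Usage (any function or variable name of interest will do):
#   where_mentioned /usr/src/linux-2.4 start_kernel
```

Reading only searches the files, so there is no risk of modifying the source by accident.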
While this may seem like quite a bit of code, much of it actually isn't used in the
kernel. Quite a large portion of this is driver code; only drivers that are needed on the
system are compiled into the kernel, and then only those that are required at run time
(the rest can be placed separately in things called modules; we will examine this topic
later).
Documentation
Every version of the kernel source comes with documentation. You should read the
following files before compiling the kernel:
· /usr/src/linux-2.4/README or /usr/src/linux/INSTALL
Instructions on how to compile the kernel.
· /usr/src/linux-2.4/MAINTAINERS
A list of people who maintain the code.
· /usr/src/linux-2.4/Documentation/*
Documentation for parts of the kernel.
It is critical that you ALWAYS read the documentation after obtaining the source
code for a new kernel, especially if you are going to be compiling in a new kind of
device. The Linux Kernel-HOWTO is essential reading for anything relating to
modifying or compiling the kernel.
The final place that you are able to find information about the kernel is within the
code itself. The development of Linux is the collaborative product of many people.
This usually means that the code (in general) is neat but sparsely commented. The
comments that do exist can be strange and at times ridiculous. Apart from providing
light entertainment, the kernel source comments can be an important guide into the
(often obscure) internal workings of the kernel. If you have not yet spent some time
having a look at the code, now would be a good time to do so.
The first incision
An obvious place to start with any large C program is the void main(void) function.
If you grep every source file in the Linux source hierarchy for this function name,
you will be sadly disappointed.
As I pointed out earlier, the kernel is a giant object file - a series of compiled
functions. It is NOT executable. The purpose of void main(void) in C is to
establish a framework for the linker to insert code that is used by the operating system
to load and run the program. This wouldn't be of any use for a kernel - it is the
operating system!
This poses a difficulty - how does an operating system run itself?
Making the heart beat...
In the case of Linux, the following steps are performed to boot the kernel:
· The boot loader program (for example GRUB or LILO) starts by loading the
vmlinuz from disk into memory. This starts the code execution.
· After the kernel image is decompressed, the actual kernel is started. This part of
the code was produced from assembler source; it is totally machine specific. The
code for this is located in the /usr/src/linux-2.4/arch/i386/kernel/head.S
file. Technically, at this point the kernel is running. This is the first process (0)
and is called swapper. swapper does some low level checks on the processor,
memory and floating point unit (FPU) availability, then places the system into
protected mode. Paging is also enabled.
· At this stage, all interrupts are disabled, though the interrupt table is set up for
later use. The entire kernel is realigned in memory (post paging) and some of the
basic memory management structures are created.
· At this point, a function called start_kernel is called. start_kernel is
physically located in /usr/src/linux-2.4/init/main.c and is actually the core
kernel function - really the equivalent of the void main(void). main.c itself is
virtually the root file for all other source and header files.
· Tests are run (the FPU bug in the Pentium chip is identified, amongst other checks
including examinations of the Direct Memory Access (DMA) chip and bus
architecture) and the BogoMips setting is established.
Aside: BogoMips (Bogus Millions of Instructions Per Second) are an interesting
and amusing phenomenon of Linux. To find out more do a Google search on the
topic.
· start_kernel sets up the memory, interrupts and scheduling. In effect, the
kernel is now multi-tasking enabled. At this stage the console has had several
messages displayed to it.
· The kernel command line options are parsed (those passed in by the boot loader),
and all embedded device driver modules are initialised.
· Further memory initialisations occur, socket/networking is started and further bug
checks are performed.
· The final action performed by swapper is the first process creation with fork
whereby the init program is launched. swapper now enters an infinite idle
loop.
It is interesting to note that as a linear program, the kernel has finished running! The
timer interrupts are now set so that the scheduler can step in and pre-empt a running
process. However, other processes will periodically execute sections of the kernel.
The above process is a huge oversimplification of the kernel's structure, but it does
give you the general idea of what it is, what it is made up of and how it loads.
The proc file system
Part of the kernel's function is to provide a file-based method of interaction with its
internal data structures. It does this via the /proc virtual file system.
The /proc file system technically isn't a file system at all; it is in fact a window into
the kernel's internal memory structures. Whenever you access the /proc file system,
you are really accessing the memory allocated to the kernel.
So what does it do?
Effectively the /proc file system provides an instant snapshot of the status of your
system. This includes memory, CPU resources, network statistics and device
information. This data can be used by programs, such as top, to gather information
about a system. top scans through the /proc structures and is able to present the
current memory, CPU and swap information, as given below:
9:59pm up 31 min, 2 users, load average: 0.00, 0.00, 0.00
50 processes: 48 sleeping, 2 running, 0 zombie, 0 stopped
CPU states: 0.9% user, 0.7% system, 0.0% nice, 98.2% idle
Mem: 126516K av, 80988K used, 45528K free, 0K shrd, 8508K buff
Swap: 257000K av, 0K used, 257000K free 39228K cached
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
971 root 15 0 1000 1000 816 R 1.3 0.7 0:00 top
614 root 15 0 1428 1428 1280 S 0.3 1.1 0:02 sshd
1 root 15 0 460 460 408 S 0.0 0.3 0:04 init
2 root 15 0 0 0 0 SW 0.0 0.0 0:00 keventd
3 root 15 0 0 0 0 SW 0.0 0.0 0:00 kapmd
4 root 34 19 0 0 0 SWN 0.0 0.0 0:00 ksoftirqd_CPU0
5 root 15 0 0 0 0 SW 0.0 0.0 0:00 kswapd
6 root 25 0 0 0 0 SW 0.0 0.0 0:00 bdflush
7 root 15 0 0 0 0 SW 0.0 0.0 0:00 kupdated
8 root 25 0 0 0 0 SW 0.0 0.0 0:00 mdrecoveryd
12 root 15 0 0 0 0 SW 0.0 0.0 0:00 kjournald
68 root 16 0 0 0 0 SW 0.0 0.0 0:00 khubd
161 root 15 0 0 0 0 SW 0.0 0.0 0:00 kjournald
162 root 16 0 0 0 0 SW 0.0 0.0 0:00 kjournald
163 root 15 0 0 0 0 SW 0.0 0.0 0:00 kjournald
164 root 15 0 0 0 0 SW 0.0 0.0 0:00 kjournald
405 root 15 0 0 0 0 SW 0.0 0.0 0:00 eth0
454 root 15 0 568 568 484 S 0.0 0.4 0:00 syslogd
458 root 15 0 432 432 376 S 0.0 0.3 0:00 klogd
475 rpc 15 0 524 524 448 S 0.0 0.4 0:00 portmap
494 rpcuser 17 0 728 728 636 S 0.0 0.5 0:00 rpc.statd
576 root 16 0 488 488 436 S 0.0 0.3 0:00 apmd
628 root 15 0 908 908 780 S 0.0 0.7 0:00 xinetd
651 root 15 0 2224 2224 1640 S 0.0 1.7 0:00 sendmail
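Some of the figures in this display can be read from /proc by hand. The sketch below is not how top itself is written; it simply assumes that /proc/meminfo contains lines of the form "MemTotal: 126516 kB" and that /proc/loadavg begins with the three load averages. The helper names meminfo_kb and load_averages are invented for the example.

```shell
# meminfo_kb (hypothetical helper): print the number that follows a
# "Key:" label in a file laid out like /proc/meminfo.
meminfo_kb() {
    awk -v key="$2:" '$1 == key { print $2; exit }' "$1"
}

# load_averages (hypothetical helper): print the first three fields of
# a file laid out like /proc/loadavg.
load_averages() {
    awk '{ print $1, $2, $3; exit }' "$1"
}

# Usage on a live system:
#   meminfo_kb /proc/meminfo MemTotal
#   load_averages /proc/loadavg
```

Taking the file name as an argument means the functions can be tried on a saved copy of the files as well as on the live /proc entries.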
The actual contents of the /proc file system on my system look like:
[root@linuxbox linux-2.4]# ls -F /proc
1/ 164/ 454/ 576/ 661/ 711/
798/ 804/ 841/ 867/ 982/ 993/
cpuinfo execdomains interrupts kcore mdstat net/
speakup/ tty/ 12/ 2/ 458/ 6/
671/ 720/ 8/ 805/ 842/ 868/
983/ 994/ devices fb iomem kmsg
meminfo partitions stat uptime 161/ 3/
475/ 614/ 68/ 729/ 801/ 806/
847/ 918/ 986/ apm dma filesystems
ioports ksyms misc pci swaps version
162/ 4/ 494/ 628/ 680/ 763/
802/ 807/ 858/ 920/ 988/ bus/
dri/ fs/ irq/ loadavg modules self@
sys/ 163/ 405/ 5/ 651/ 7/
791/ 803/ 814/ 859/ 979/ 991/
cmdline driver/ ide/ isapnp locks mounts@
slabinfo sysvipc/
Each of the above numbered directories stores the “state” information of a process,
named by its PID. This means that the directory 1/ stores all of the state information
for the process with a PID of 1. The self/ directory contains information for the process
that is viewing the /proc file system, i.e. YOU. The information stored in this
directory looks like:
cmdline (Current command line)
cwd - [0303]:132247 (Link to the current working directory)
environ (All environment variables)
exe - [0303]:109739 (Currently executing code)
fd/ (Directory containing virtual links to file handles)
maps| (Memory map structure)
root - [0303]:2 (Link to root directory)
stat (Current process statistics)
statm (Current memory statistics)
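The per-process directories can be explored from the shell. In this sketch the helper names list_pids and proc_cmdline are hypothetical; the one format detail relied upon is that the cmdline file separates its arguments with NUL bytes, which is why they are converted to spaces before display.

```shell
# list_pids (hypothetical helper): print the numbered (per-process)
# directory names found under a /proc-style directory, in PID order.
list_pids() {
    ls "$1" | grep '^[0-9][0-9]*$' | sort -n
}

# proc_cmdline (hypothetical helper): print the command line recorded
# for PID. Arguments in the cmdline file are separated by NUL bytes,
# so they are turned into spaces for display.
proc_cmdline() {
    tr '\0' ' ' < "$1/$2/cmdline" | sed 's/ *$//'
}

# Usage on a live system:
#   list_pids /proc
#   proc_cmdline /proc 1
#   proc_cmdline /proc self
```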
Most of these files can be displayed to the screen using the cat command. The
/proc/filesystems file, when cat'ed, lists the supported file systems. The
/proc/cpuinfo file gives information about the hardware of the system:
psyche:~$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 5
model : 4
model name : Pentium MMX
stepping : 3
cpu MHz : 200.456
fdiv_bug : no
hlt_bug : no
f00f_bug : yes
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr mce cx8 mmx
bogomips : 399.76
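When only one of these values is needed, it can be picked out with awk. This is a sketch that assumes the "name : value" layout shown in the listing above; cpuinfo_field is an invented helper name.

```shell
# cpuinfo_field (hypothetical helper): print the value of the first
# "name : value" line whose name matches, from a cpuinfo-style file.
cpuinfo_field() {
    awk -F: -v want="$2" '{
        name = $1
        gsub(/^[ \t]+|[ \t]+$/, "", name)    # trim padding around the name
        if (name == want) {
            value = $0
            sub(/^[^:]*:[ \t]*/, "", value)  # keep what follows the colon
            print value
            exit
        }
    }' "$1"
}

# Usage on a live system:
#   cpuinfo_field /proc/cpuinfo "model name"
#   cpuinfo_field /proc/cpuinfo bogomips
```

Nothing is printed when the requested name is absent, which a calling script can test for.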
You need to be aware that upgrading the kernel may change the structure of the
/proc file system. This may require additional software upgrades. Information about
this should be provided in the kernel README files within the specific Linux
directory you have upgraded to.
Exercises
14.1. What are the contents of the file /proc/modules? What do you think it
represents?
14.2. Find out where your kernel image is located and how large it is.
14.3. Examine the /proc file system on your computer. What do you think
the /proc/kcore file is? Hint: Have a look at the size of the file.
Really, why bother?
After all of these readings you might be asking yourself, why bother with recompiling
the system? It is working just fine the way it is. This is actually a very good
question. This section looks at some of the main reasons that you as a Systems
Administrator may want or need to recompile the system.
· Surprisingly the best time to recompile your kernel is straight after you've
installed Linux onto a new system. The reason for this is that the original Linux
kernel provided has extra drivers compiled into it; these extra drivers consume
large amounts of memory. By recompiling the kernel and identifying only the
components that are required for your specific system, you can make the system
run much more effectively and efficiently.
· Some installations don’t have support for some very common sound cards and
network devices! This is done intentionally and to be fair, there are good reasons
for this (Interrupt Request (IRQ) conflicts etc), but this does mean a kernel
recompile is required.
· One of the most common reasons why people recompile the kernel is to add a new
component or device to their system. An example might be to add a new piece of
hardware and they want the hardware device driver to be part of the kernel itself