Background

The Linux kernel has supported dynamically loadable modules since version 2.0. This means that many parts of the kernel can be compiled as modules, which can be loaded (i.e. attached to the kernel) and unloaded (i.e. detached) at runtime, without shutting down the running system. It also means that most users no longer need to recompile their kernels to enable drivers for their specific hardware. Instead, distribution vendors compile a very generic base kernel together with loadable modules for most supported devices. Most distributions ship so many loadable modules that users rarely NEED to recompile their kernels.

What is nice is that the kernel, together with the module (un)loader ``modprobe'', supports automatic module (un)loading: when you try to access a device and the module required to support it is not yet loaded, the kernel can trigger the module loader to load that module on the fly. (Of course, this requires the module loader to be configured properly, which is the case in most distributions.) The other half of the story is that once you have finished using a device (think of a USB thumb drive), the module responsible for it (e.g. usb-storage) is automatically unloaded after a certain timeout.

Why unload modules

Some people may wonder why there is a need to unload a module once it is loaded. There are several reasons for which this is desired:
  1. Loaded modules take up UNSWAPPABLE memory. Having unused modules thus means having less physical RAM for processes, reaching the critical thrashing point earlier.
  2. Some modules simply conflict with one another and hence cannot be loaded simultaneously. For example, two modules may use the same serial port and so fight over it (see lirc_sir and irda). Yet if you load only one of them at a time (and unload it when finished), you can happily use all of them in a time-separated manner.
  3. Of course, when a bug in a module is found, and there is a replacement, you'd want to unload the old one and load the new one.
Of these, (3) cannot be handled by automatic module (un)loading; you have to do it manually. But since it is not part of the daily routine, it doesn't matter much. (2) doesn't occur too often, either. (1) is the major factor that motivates me to configure the system to auto-unload unused modules: I want to free up more RAM for applications and the buffer cache.

The ``improvement'' in Linux 2.6

This module auto-(un)loading mechanism worked fine for many years (from Linux 2.0.* to Linux 2.4.*). But in Linux 2.6, the kernel developers decided to replace the old module code with a new implementation. The new modprobe is much simpler (i.e. it has fewer features). In particular, the automatic unloading support is no longer there.

At first glance, this is not difficult to work around: write a simple shell or Perl script that parses the contents of /proc/modules and uses the rmmod command to remove every module with a usage count of zero. We can schedule this script to run every 10 minutes, so that any module that has been unused for 10 minutes is certain to have been unloaded.
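A minimal sketch of such a script might look like the following. It assumes that the third whitespace-separated field of each line in /proc/modules is the usage count; the function names are my own, made up for illustration.

```shell
#!/bin/sh
# List every module whose usage count (3rd field) is zero.
# $1 = module list to read (defaults to /proc/modules)
list_idle_modules() {
    awk '$3 == 0 { print $1 }' "${1:-/proc/modules}"
}

# Try to remove each idle module. rmmod needs root privileges and
# refuses to remove a module that something else still depends on.
unload_idle_modules() {
    list_idle_modules "$@" | while read -r mod; do
        rmmod "$mod" 2>/dev/null || echo "could not remove $mod" >&2
    done
}
```

Saved as a script that ends with a call to unload_idle_modules, it could be run from cron every 10 minutes.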

Unfortunately, this doesn't work, because of changes in how module usage counts are handled. Before 2.6, modules were responsible for maintaining their own reference counts. When the device file is opened, the module increases the count; when the file is closed, the module decreases it. Modules that are not accessed via device files (e.g. network interface drivers) do similar things in other places. Since the module itself knows best when it is being used and when it is not, this seems like a good idea.

However, the kernel developers consider this a bad idea, because it breaks in some cases. The principle (which I agree with) is that a module cannot keep its own usage count correctly from the inside; reference counts should be kept by an outsider, which is necessary to avoid race conditions. So the design decision was to take the task of maintaining the reference count away from the modules and have the core kernel take care of it. The modules need not care about it. Isn't that nice? Well... not always. Think about how the kernel maintains the usage count for modules. For device files, it increases the reference count when the file is opened and decreases it when the file is closed. OK. For network interfaces, the core kernel does similar things. The problem arises when modules are used by means that the kernel developers cannot predict. Not a problem: the writers of such modules should be aware of this and call the module_get function explicitly to increase the usage count. That's fine. But...

Modules as Virtual Hardware

Another big difference in the 2.6 kernel is that there are now many modules that behave like ``virtual hardware''. These modules only need to be loaded to operate: they add functionality to the kernel, and there is no need to open any special file or network interface for them. They are also designed to be unloaded at will. When they are removed, the functionality they provide neatly disappears from the kernel. Since they are meant to be removable at will, their usage count must be zero. But even though the usage count is zero, removing them does have a side effect: the functionality they provide goes away. These modules are a big problem for implementing auto-(un)loading. There is no device file whose access could trigger the loading of these modules, and there is no way to know whether they are really in use, because their usage count is always zero.

As an example, look at the input layer. The kernel now provides a /dev/input/mice device file, which combines the input from all mice into a single stream. But where are the mice? Well... you have to load the module mousedev for them. mousedev is a piece of ``virtual hardware'' that provides mouse input events to /dev/input/mice. But even mousedev does not talk to the mice; it is just a mediator. You still need to load the module psmouse to bridge the gap between mousedev and a PS/2 mouse, and you need sermouse to connect mousedev to a COM-port mouse. Now, if you run lsmod, you'll find that the usage count of psmouse or sermouse is zero. Yes, it's zero, even though you're happily using the mouse under XFree86. If you try to rmmod psmouse, haha... it goes through, and the PS/2 mouse is detached from /dev/input/mice. The mouse no longer works (until you load the psmouse module again). How come? A device is in use, and yet its kernel driver module can be removed!

Another example is the keyboard. Most people won't encounter this problem, because they compile the keyboard driver into the core kernel. But those adventurous enough to try building it as modules will find the related modules (on the i386 platform): i8042, serio and atkbd. This time, i8042 is the real hardware driver, serio is an abstraction layer above it, and atkbd connects this to the /dev/input/* files. Again, on any running system, you can rmmod atkbd and the keyboard is disabled. You can do it even while someone is typing on the keyboard. The usage count of atkbd is zero, whether or not there are consoles connected to it! What's more severe is what happens if you have loaded all three modules and then rmmod i8042. Oops! Well... not an Oops! in the sense of a ``kernel oops''. Normally, when the kernel detects something inconsistent, it ``oopses'' and prints some error messages on the console. But this time, the machine simply hangs. A complete freeze. No keyboard response (maybe because i8042 is already unloaded). No response to ping from other machines (how come?). This phenomenon (if not a bug) has been observed in kernel versions 2.6.5, 2.6.6-rc3, 2.6.6 and 2.6.7-rc1. It may have been there since 2.6.0.

Update (2004-05-10)

I eventually traced this kernel hang, triggered by rmmod i8042, to a bug in the source file drivers/input/serio/i8042.c. For details and patches, please read my patch posted to the Linux Kernel Mailing List. Disappointingly, 2.6.7-rc1 (released on 2004-05-23) still did NOT incorporate my patch. After I repeatedly urged the kernel maintainers, the fix was finally incorporated into Linux kernel 2.6.7 (2004-06-16).

My workaround—modused

I have developed a module called modused. Its purpose is simply to create artificial usage counts for any modules you like. It pretends to use the modules you name, so that the kernel will refuse to unload them. Of course, you can decrease the usage counts again when you need to unload the modules.

The user interface

After loading this module, you'll find the file /proc/modused. You can then interact with the module by writing to or reading from this file. The file is created with permissions -rw-------; you may use chmod to relax the restrictions.

To increase the usage count of module xxx, use the command

    echo xxx > /proc/modused
You can compare the output of lsmod before and after issuing this command: you'll notice that the usage count of module xxx has increased by 1. This is enough to prevent rmmod from unloading it. You can repeat this for any other modules whose usage count you want to keep non-zero. (You can even do it for the same module multiple times! The usage count is increased once for each time.) If module xxx has not been loaded yet, the above command fails and a kernel error message is printed. The shell command echo does not report the error itself, but you can check whether ``echo $?'' prints zero (success) or non-zero (failure).
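Since the failure is easy to miss, a script may want to wrap the write and test its exit status. The helper below is only a sketch; the name modused_hold and the optional second argument (which lets you point it at another file for testing) are made up for illustration.

```shell
# Ask modused to hold one more reference on a module.
# $1 = module name, $2 = control file (defaults to /proc/modused)
modused_hold() {
    # The braces let us silence the shell's own complaint if the
    # control file cannot be opened at all.
    if { printf '%s\n' "$1" > "${2:-/proc/modused}"; } 2>/dev/null; then
        return 0
    fi
    echo "modused: could not hold $1 (is it loaded?)" >&2
    return 1
}
```

A script could then run something like ``modused_hold psmouse || exit 1'' instead of ignoring the result.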

To check which modules are being held ``in use'' by modused, simply read from the file:

    cat /proc/modused
Note that the module names appear in the order in which you added them. A module that has been added repeatedly appears that many times in the list.

To remove module xxx from the list, use the command

    echo -xxx > /proc/modused
The module is then removed from the list, and its usage count (as shown in the output of lsmod) is decreased by 1. This can bring the usage count of module xxx down to zero, so that you can remove it with ``rmmod xxx''. If xxx was not on the list shown by ``cat /proc/modused'', the above command fails with a kernel error message. Again, ``echo'' does not report the error, although you can check for it using ``echo $?''. If you added module xxx several times, the above command removes only one of the added instances; the others are kept and still shown by ``cat /proc/modused''. You need to repeat the command once per instance to remove all the usage references that modused holds on that module.
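Repeating the command by hand gets tedious, so here is a sketch of a loop that does it. It assumes that reading /proc/modused yields one module name per line; the function name and its optional second argument are hypothetical.

```shell
# Release every reference that modused holds on one module.
# $1 = module name, $2 = control file (defaults to /proc/modused)
modused_release_all() {
    local name="$1" ctl="${2:-/proc/modused}" n i=0
    # Count the lines that consist of exactly this module name.
    n=$(grep -cxF -- "$name" "$ctl" 2>/dev/null || true)
    n=${n:-0}
    while [ "$i" -lt "$n" ]; do
        # Same effect as: echo -xxx > /proc/modused
        printf '%s\n' "-$name" > "$ctl" || return 1
        i=$((i + 1))
    done
    echo "$n"   # report how many references were released
}
```

The count is taken once, before anything is written, so the loop stops after as many removals as there were entries when it started.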

Finally, there is a quick way to undo everything that modused has done and bring the usage counts of the affected modules back to normal: unload the modused module.

The chicken and the egg

If you find modused useful, you will of course want to prevent it from being unloaded accidentally (e.g. by a script that removes everything that shows up in ``lsmod'' with a zero usage count). But modused itself is just such a victim!

Exercise: How can you fool the kernel into thinking that modused has a usage count of one, thereby preventing an rmmod modused?

The code

The gzipped patch should be applied to kernel version 2.6.5 or 2.6.6. It also works on 2.6.7-rc1 and 2.6.7. It may work on other versions, too. Good luck!

After patching, configure the kernel as usual and enable the item "Loadable Module Support"/"The /proc/modused facility". Then compile the kernel as usual. I have only tried compiling it as a module, but you're welcome to try compiling it into the core kernel. Tell me whether it works.

To-dos and open questions

  1. Should modused be placed under /dev/ or /proc (or both)?

    After a short discussion with Tuukka Toivonen, I have decided to put it under /proc. The reason is that the interface of the file /proc/modused resembles that of other files in /proc rather than a normal char device.

  2. Although lsmod shows the usage count as desired, it doesn't show modused in the ``Used by'' column. Is it a good idea to make modused appear there? This would make modused somewhat more complicated and require more changes to module.c. The question is whether it is worth the extra coding, since you can always ``cat /proc/modused'' to find out which modules are being held by modused. A simple shell/Perl script can combine the contents of /proc/modules and /proc/modused to produce a nice listing. (Doing it in userland in a scripting language is much more flexible and doesn't waste valuable kernel memory, which is non-swappable.)
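Such a userland listing could be sketched roughly as follows. It assumes that /proc/modused contains one module name per line and that the third field of /proc/modules is the usage count; the function name list_holds is made up.

```shell
# Print "name<TAB>usage count<TAB>references held by modused" for
# each loaded module.
# $1 = module list (defaults to /proc/modules)
# $2 = modused list (defaults to /proc/modused)
list_holds() {
    awk -v held_file="${2:-/proc/modused}" '
        BEGIN {
            OFS = "\t"
            # Count how often each name appears in the modused list.
            while ((getline line < held_file) > 0)
                held[line]++
            close(held_file)
        }
        { print $1, $3, held[$1] + 0 }
    ' "${1:-/proc/modules}"
}
```

A module whose second and third columns are equal is kept loaded only by modused.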

Last updated: $Date: 2004/06/16 11:12:10 $ (UTC)