Absolute BSD - The Ultimate Guide To FreeBSD (2002).pdf

Installing Extra Linux Packages as RPMs

When a Linux program complains that it cannot find a necessary program, you may need to add that program under /usr/compat/linux. Since FreeBSD's Linux mode is based on Red Hat Linux, you can easily grab the appropriate components of Red Hat Linux and install them in your Linux subsystem.

Red Hat Linux is distributed in RPM (Red Hat Package Manager) format. (You can find a good selection of Red Hat Linux RPMs at FTP mirror sites around the world; see http://www.redhat.com/ for the latest mirror list.) RPM files are like FreeBSD's binary packages; they're just compressed files containing everything needed to run a program, and they are designed to be installed and uninstalled as a unit. People argue about the merits of RPM versus pkg_add versus the many other package-management systems used by open-source software, but since FreeBSD's Linux compatibility package is based on Red Hat Linux, we use Red Hat tools.

When using RPMs, be certain to install the software under /compat/linux. If you just blindly run RPM as described in the rpm(8) man page, you'll wind up overwriting part of your FreeBSD system. This would be bad; while FreeBSD can run Linux binaries, you cannot combine a FreeBSD and Linux userland arbitrarily and expect anything to work. Trying this is a good way to become familiar with the emergency repair process described in Chapter 3.

To safely install an RPM, do this:

...............................................................................................

# rpm -i --ignoreos --dbpath /var/lib/rpm --root /compat/linux packagename

...............................................................................................

Note Of course, RPM packages are completely separate from FreeBSD's usual package system. You cannot pkg_delete these; you must use RPM to handle them.
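If you install Linux RPMs often, it may be worth wrapping that command so the flags are never forgotten. The following is only a sketch: the helper name rpm_install_argv is made up for illustration, the flags and paths are copied from the command above, and the function only builds the command line without running anything.

```python
import shlex

LINUX_ROOT = "/compat/linux"  # Linux subsystem root, as in the command above
RPM_DBPATH = "/var/lib/rpm"   # RPM database path, as in the command above


def rpm_install_argv(package, root=LINUX_ROOT, dbpath=RPM_DBPATH):
    """Build (but do not run) the rpm command line for the Linux subsystem.

    rpm_install_argv is a hypothetical helper, not part of FreeBSD or rpm.
    """
    # Installing with the real root would overwrite parts of FreeBSD itself.
    if root in ("", "/"):
        raise ValueError("refusing to install Linux RPMs over the FreeBSD root")
    return ["rpm", "-i", "--ignoreos", "--dbpath", dbpath, "--root", root, package]


# Show the command that would be run for a placeholder package name.
print(shlex.join(rpm_install_argv("packagename")))
```

The one design choice here is the refusal to accept / as the install root, which guards against the mistake described earlier.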

Using Multiple Processors—SMP

Computers with multiple CPUs have been around for decades, but they are just now becoming popular in the Intel-compatible world. FreeBSD has supported the use of multiple CPUs since version 3, but hardware is just now becoming affordable enough for small companies and hobbyists to implement it.

What Is SMP?

Symmetric multiprocessing (SMP) describes a system with two or more identical processors. Before you ask: Yes, there are other variants on multiple-processor handling that might be used some day. Some computer scientists insist that asymmetrical multiprocessing will be more efficient. You can't buy that hardware, however, so it's moot at the moment.

SMP has quite a few advantages over single processors, and they go beyond the obvious "more power!" At the microscopic level, a CPU can only do one thing at a time. Every process on the computer competes for processor time. If the CPU is performing a database query, it isn't accepting the packet that the Ethernet card is trying to deliver. Every fraction of a second, the CPU does a context switch and works on some other process assigned by the kernel. This happens often enough and quickly enough that the CPU appears to be doing many things at once, much as a television picture appears to move by showing individual frames very quickly. With multiple processors, your computer can do multiple things simultaneously. This can be a wonderful thing, but it increases system complexity dramatically.

Since one CPU can only do one thing at a time, many programs have been written to work around this limitation. In fact, many programs that you would expect to be only one process aren't. The Apache Web server, for example, actually starts quite a few processes to serve up Web pages, allowing it to work well on multiple-processor systems.

SMP has long been a feature in commercial UNIX. Sun Microsystems just announced a 102-CPU SPARC system. Even Windows 2000 supports multiple CPUs, in a somewhat goofy way. I had an opportunity to take home a four-processor Intel 486 system at one point, and while I never would have used it, part of me regrets dragging it to the curb. Today a variety of manufacturers provide x86 SMP motherboards, including big-name dealers such as Dell and Compaq.

Kernel Assumptions

To understand SMP and the problems associated with it, we have to delve into the kernel. All operating systems face the same problems when supporting SMP, and the theory here is applicable across a variety of platforms. FreeBSD is somewhat different from other operating systems, though, because it has 30 years of UNIX heritage to deal with, and its development model doesn't allow work to stop for a month at a time.

Now, that said, let me say that what follows is a gross simplification. Kernel design is a tricky subject, and it's almost impossible to do it justice when describing it at a level for nonprogrammers. But here's an explanation of how it all works, in its most basic form.

Your computer appears to be doing many things simultaneously: For example, I have WindowMaker running, Netscape merrily soaking up the cable modem, and assorted port builds going on. Network interrupts are arriving, the screen is displaying new text, the Apache Web server is sending out pages, and so on. Actually, all this only looks simultaneous. Your average CPU can only do one thing at a time.[3]

FreeBSD divides CPU utilization into time slices; a slice is the length of time the CPU spends doing one task. One process can use the CPU for either a full time slice or until there are no more tasks for it to do, at which point the next process may run. The kernel uses a priority-based system to allocate time slices and to determine which programs can run in which time slices. If a process is running, but a higher-priority process presents itself, the kernel allows the first process to be interrupted, or preempted. This is commonly referred to as preemptive multitasking.
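A toy model can make time slices and preemption easier to picture. The sketch below is not FreeBSD's scheduler. It assumes one CPU, treats one loop iteration as one time slice, and uses made-up process names and priorities. Each slice goes to the highest-priority process that is ready, and processes of equal priority take turns.

```python
def run_schedule(procs):
    """Simulate one CPU handing out time slices.

    procs maps a process name to (priority, arrival_slice, slices_needed).
    A higher priority number wins. Returns the name that ran in each slice.
    """
    remaining = {name: spec[2] for name, spec in procs.items()}
    last_ran = {name: -1 for name in procs}
    timeline = []
    tick = 0
    while any(remaining.values()):
        ready = [name for name, spec in procs.items()
                 if spec[1] <= tick and remaining[name] > 0]
        if ready:
            # Highest priority first; among equals, whoever ran least recently.
            chosen = min(ready, key=lambda n: (-procs[n][0], last_ran[n]))
            remaining[chosen] -= 1
            last_ran[chosen] = tick
            timeline.append(chosen)
        else:
            timeline.append("idle")
        tick += 1
    return timeline


# Hypothetical workload: two ordinary jobs, plus a high-priority one
# that shows up at slice 2 and preempts them.
workload = {
    "build": (1, 0, 4),
    "editor": (1, 0, 2),
    "net": (5, 2, 2),
}
print(run_schedule(workload))
```

In this run, build and editor share the CPU until net arrives, net then takes every slice until it finishes, and the two lower-priority jobs resume afterward.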

Now, although the kernel is running, it isn't a process; processes are run by the kernel. A process has certain sorts of data structures set up by the kernel, and the kernel manipulates them as it sees fit. You can consider the kernel a special sort of process, one that is handled very differently from regular processes. It cannot be interrupted by other programs—you cannot type killall kernel and reboot the system. And traditionally the kernel doesn't show up in top and similar tools.

Older UNIX and FreeBSD kernels get around some of the SMP problems by declaring that the kernel is nonpreemptive and cannot be interrupted. This simplifies kernel management issues because it makes everything quite deterministic: When a part of the kernel allocates memory, it can count on that chunk of memory being there when it executes the next instruction. No other part of the kernel will grab that particular chunk of memory.


This situation changed (for the better) after version 2.2.

FreeBSD 3.0 SMP

The first implementation of FreeBSD SMP was pretty straightforward: Processes were scattered between the CPUs (achieving a rough load balance), and there was a "lock" on the kernel. The CPU had to hold this lock to run the kernel, and before a CPU would try to run the kernel, it checked to see if the lock was available. If the lock was available, it took the lock and ran the kernel. If the lock was unavailable, the CPU knew that the kernel was being run elsewhere and went on to handle something else. This lock was called the Big Giant Lock (BGL). Under this system, the kernel could know that data would not change from under it. Essentially, it guaranteed that the kernel would only run on one CPU, just as it always had.
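The check-then-take behavior described above looks roughly like a non-blocking lock acquire. Here is a minimal sketch, assuming Python's threading.Lock as a stand-in for the Big Giant Lock and two imaginary CPUs driven one after another from a single thread; the function names are invented for illustration and have nothing to do with the real kernel code.

```python
import threading

giant = threading.Lock()  # stand-in for the Big Giant Lock


def try_enter_kernel(cpu):
    """Try to take the lock without waiting, as each CPU did."""
    if giant.acquire(blocking=False):
        return f"{cpu}: holds the lock, runs the kernel"
    return f"{cpu}: lock busy, goes on to other work"


def leave_kernel():
    """Release the lock once kernel work is finished."""
    giant.release()


events = []
events.append(try_enter_kernel("cpu0"))  # lock is free, so cpu0 takes it
events.append(try_enter_kernel("cpu1"))  # lock is held, so cpu1 is turned away
leave_kernel()                           # cpu0 finishes its kernel work
events.append(try_enter_kernel("cpu1"))  # now cpu1 gets in
leave_kernel()

for line in events:
    print(line)
```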

This strategy worked well enough for two CPUs: You could run a medium-level database and Web server on a twin-CPU machine, and feel confident that the CPU wouldn't be your bottleneck. If one CPU was busy serving up Web pages, the other would be free to answer database calls. But if you wanted to run an eight-CPU machine, you were in trouble; the system would spend a lot of time just waiting for the Big Giant Lock to become available! The kernel still knew that it was only doing one thing at a time, and if a kernel instruction changed some internal value, it would still be that way when it returned.

There are many problems with this system, but fundamentally it's simplistic, and neither efficient nor scalable. In fact, the standard textbooks on SMP rarely mention this method of handling the kernel because it's so clunky. Still, it beats some other operating systems' methods of handling SMP. For example, a twin-processor Windows 2000 system's default setup dedicates one processor to the user interface and uses the other processor for everything else. While the interface is snappy and the mouse doesn't drag when you load the system, I would hope that most people don't purchase SMP hardware to address graphical interface problems.

With the growth of system hardware, multiple-CPU systems will become very common in just a few years. For FreeBSD to continue to be a quality operating system, this problem must be addressed.

FreeBSD 5 SMP

One of the benefits of the BSDi/Walnut Creek merger was the release of the BSD/OS 5.0 code base to the FreeBSD development community. BSD/OS contains a great deal of proprietary information, so the source code cannot be released to the general public. Still, FreeBSD developers were able to read portions of the code. The most interesting part of this code was that multiple CPUs could be in the kernel at once—something that will be heavily implemented in version 5, and which will mark one of the big differences between FreeBSD version 4 and version 5.

To prevent information corruption, the new FreeBSD SMP system combines the Big Giant Lock with a smaller lock called a mutex. When a piece of the kernel wants to work on a chunk of data, it slaps a mutex over it. When another part of the kernel tries to access this mutex-locked data, it says, "Oh, I can't touch that," and either waits for the resource to become available or tries to allocate some other resource. The goal is to eliminate the Big Giant Lock, and to have all kernel operations only mutex-lock the small bits of data that they need. As the kernel's smaller systems are rewritten to take advantage of mutexes, their need to hold the BGL will be eliminated. According to Greg Lehey, a major FreeBSD developer and member of the SMP project, this method is expected to scale to beyond 32 processors.
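The difference between one big lock and many small ones is easiest to see side by side. This sketch models only the structure of the two approaches. The class names and the "net" and "disk" subsystems are invented for illustration, and because Python threads do not run in parallel on multiple CPUs, it demonstrates that both schemes keep data consistent rather than how fast either one is.

```python
import threading


class GiantLockKernel:
    """One lock guards everything, so unrelated work still queues up."""

    def __init__(self, subsystems):
        self.giant = threading.Lock()
        self.counters = {name: 0 for name in subsystems}

    def touch(self, name):
        with self.giant:
            self.counters[name] += 1


class MutexKernel:
    """Each chunk of data gets its own mutex; only real conflicts wait."""

    def __init__(self, subsystems):
        self.locks = {name: threading.Lock() for name in subsystems}
        self.counters = {name: 0 for name in subsystems}

    def touch(self, name):
        with self.locks[name]:
            self.counters[name] += 1


def hammer(kernel, name, times):
    """Update one subsystem's data over and over."""
    for _ in range(times):
        kernel.touch(name)


def run(kernel_cls, subsystems=("net", "disk"), times=10000):
    """Start two worker threads per subsystem and return the final counters."""
    kernel = kernel_cls(subsystems)
    threads = [threading.Thread(target=hammer, args=(kernel, name, times))
               for name in subsystems for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return kernel.counters


print("giant lock:", run(GiantLockKernel))
print("mutexes:   ", run(MutexKernel))
```

Both versions end with the same correct totals. The difference is that under MutexKernel a thread updating "net" never has to wait for one updating "disk", which is the property that lets the real design scale to more CPUs.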


Note The BGL could have been ripped out entirely and replaced with mutexes everywhere in one massive frenzy of hacking (as commercial OS vendors do), completing the process in only a couple of months, so why not do so? Because doing so would have meant that FreeBSD-current would have been utterly unusable for several months, and 5.0-release would have been poorly debugged. Also, the volunteer developers working on other parts of the system would have had nothing to do. (Telling volunteer developers that they can't do anything is an excellent way to lose them.)

This should give you enough understanding of how SMP works that you can administer it reasonably well. Now, let's look at the details of handling an SMP system.

Using SMP

When using SMP, remember that multiple processors don't necessarily make things go faster. One processor can handle a certain number of operations per second; a second processor just means that the computer can handle twice that many operations per second, but those operations are not necessarily faster.

Think of the CPU count as lanes on a road.[4] If you have one lane, you can move one car at a time past any one spot. If you have four lanes, you can move four cars past that spot. Although the four-lane road won't necessarily allow those cars to reach their destination more quickly, there'll be a lot more of them arriving at any one time. If you think this doesn't make a difference, contemplate what would happen if someone replaced your local freeway with a one-lane road. CPU bandwidth is important.
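The lane analogy reduces to a little arithmetic. The sketch below assumes idealized conditions that real systems never quite meet: every job is independent, every job takes the same time, and adding CPUs costs nothing. The function names and the numbers are illustrative only.

```python
import math


def throughput(cpus, seconds_per_job):
    """Jobs finished per second when every CPU stays busy."""
    return cpus / seconds_per_job


def batch_time(jobs, cpus, seconds_per_job):
    """Wall-clock time to finish a batch of independent, equal-length jobs."""
    return math.ceil(jobs / cpus) * seconds_per_job


# One job takes just as long on four CPUs as on one...
print("1 job, 1 CPU: ", batch_time(1, 1, 2.0), "seconds")
print("1 job, 4 CPUs:", batch_time(1, 4, 2.0), "seconds")

# ...but a batch of eight jobs clears the road four times sooner.
print("8 jobs, 1 CPU: ", batch_time(8, 1, 2.0), "seconds")
print("8 jobs, 4 CPUs:", batch_time(8, 4, 2.0), "seconds")
```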

Most user processes don't have to worry about when to use SMP; a process just requests some CPU time and the kernel allocates it. The program doesn't worry about where this CPU time is coming from.

The problem with SMP occurs when you want to have one process use multiple CPUs. The short answer is, you can't do that unless the program is threaded. Threaded programs are written specifically to run on multiple processors. (Check the program documentation to see if the program is threaded.) Programs such as Apache, which run multiple processes to serve requests, are not threaded but might as well be. Taken as a whole, Apache takes excellent advantage of multiple CPUs.

SMP and Upgrades

The most common "problem" people encounter with SMP is when performing the default torture test, an upgrade from source. It appears that no matter what, the system never seems to use more than one CPU at a time. The "top" program will show that the system is 50 percent idle, no matter what.

Trust your eyes. If the system appears to be half idle, you're only using one of your CPUs. The make program that handles building software issues a command, waits for a response, then issues another command. Each of these subtasks might be assigned to a different CPU, but the actual make command won't try to do anything until that original process comes back successful. It only does one thing at a time.

You can get around this problem with make's -j flag, which tells make to run multiple processes simultaneously. The -j flag takes its own argument, the number of make processes to run:


...............................................................................................

# make -j4 buildworld

...............................................................................................

This line tells make to run four processes, and hopefully it will complete more quickly. This doesn't mean that your make will be completed in one-fourth the time, however; you still have other issues to contend with (see Chapter 14).
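One rough way to see why four processes rarely mean one-fourth the time is Amdahl's law, which this chapter does not use but which fits the situation: whatever part of the build must run serially limits the total gain. The sketch below assumes a made-up build in which 80 percent of the work can run in parallel, and it ignores disk and memory limits entirely.

```python
def estimated_speedup(parallel_fraction, jobs):
    """Amdahl's law: the serial share of the work caps the gain from more jobs."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / jobs)


# Hypothetical build where 80 percent of the work can run in parallel.
for jobs in (1, 2, 4, 8):
    print(f"make -j{jobs}: about {estimated_speedup(0.8, jobs):.2f}x faster")
```

Under that assumption, four jobs give a speedup of about 2.5 rather than 4, and doubling to eight jobs helps less still.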

Note Not all programs can handle being built with the -j flag. At times, even buildworld fails. (There is some discussion of disabling support for make -j in buildworld, as it causes many problems.) It's worth trying, but if things go badly, you need to fall back to plain old serial make.

Multiple processors are not the be-all and end-all of high-performance computing. Your application must be written to take advantage of them. If it isn't, extra CPUs will not help.

[3] Some CPUs (the Alpha) can do multiple things at once. These dual-issue and quad-issue processors are slowly becoming more common. This is one reason why the Alpha was such wonderful technology, and why it's bad for us all that the Alpha is no more.

[4] This example assumes that everyone drives the speed limit, taking turns and not cutting each other off, and in general not acting like real drivers in any American city.
