39.12 What Makes Your Computer Slow? How Do You Fix It?

Article 39.5 discussed the various components that make up a user's perception of system performance. There is another equally important approach to this issue: the computer's view of performance. All system performance issues are basically resource contention issues. In any computer system, there are three fundamental resources: the CPU, memory, and the I/O subsystem (e.g., disks and networks). From this standpoint, performance tuning means ensuring that every user gets a fair share of available resources.

Each resource has its own particular set of problems. Resource problems are complicated because all resources interact with one another. Your best approach is to consider carefully what each system resource does: CPU, I/O, and memory. To get you started, here's a quick summary of each system resource and the problems it can have.

39.12.1 The CPU

On any time-sharing system, even single-user time-sharing systems (such as UNIX on a personal computer), many programs want to use the CPU at the same time. Under most circumstances the UNIX kernel is able to allocate the CPU fairly; however, each process (or program) requires a certain number of CPU cycles to execute and there are only so many cycles in a day. At some point the CPU just can't get all the work done.

There are a few ways to measure CPU contention. The simplest is the UNIX load average, reported by the BSD uptime (39.7) command. Under System V, sar -q provides the same sort of information. The load average tries to measure the number of active processes at any time (a process is a single stream of instructions). As a measure of CPU utilization, the load average is simplistic and poorly defined, but far from useless.
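As a rough illustration of using the load average from a program, here is a minimal sketch in Python, assuming a UNIX-like system where os.getloadavg() is available. The helper names load_per_cpu and current_load_per_cpu are hypothetical, and dividing by the processor count is only one common rule of thumb for reading the numbers, not something uptime does itself.

```python
import os


def load_per_cpu(load_averages, cpu_count):
    """Divide each load average by the number of processors (hypothetical helper)."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return tuple(load / cpu_count for load in load_averages)


def current_load_per_cpu():
    """Read the 1-, 5-, and 15-minute load averages and scale them."""
    # os.getloadavg() raises OSError where the load average is unobtainable.
    loads = os.getloadavg()
    # os.cpu_count() may return None; fall back to one processor.
    cpus = os.cpu_count() or 1
    return load_per_cpu(loads, cpus)
```

Calling current_load_per_cpu() returns three scaled values. A figure that stays well above 1.0 hints that more processes are active than there are processors to run them, but as noted above it says nothing by itself about memory or I/O.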

Before you blame the CPU for your performance problems, think a bit about what we don't mean by CPU contention. We don't mean that the system is short of memory or that it can't do I/O fast enough. Either of these situations can make your system appear very slow, yet the CPU may be spending most of its time idle. Therefore, you can't just look at the load average and decide that you need a faster processor; if memory or I/O is the real bottleneck, your programs won't run a bit faster. To understand your system, you also need to find out what your memory and I/O subsystems are doing. Users often point their fingers at the CPU, but I would be willing to bet that in most situations memory and I/O are equally (if not more) to blame.

Given that you are short of CPU cycles, you have three basic alternatives:

If none of these options is viable, you may need to upgrade your system.

39.12.2 The Memory Subsystem

Memory contention arises when the memory requirements of the active processes exceed the physical memory available on the system; at this point, the system is out of memory. To handle this lack of memory without crashing the system or killing processes, the system starts paging: moving portions of active processes to disk in order to reclaim physical memory. At this point, performance decreases dramatically. Paging is distinguished from swapping, which means moving entire processes to disk and reclaiming their space. Paging and swapping indicate that the system can't provide enough memory for the processes that are currently running, although under some circumstances swapping can be a part of normal housekeeping. Under BSD UNIX, tools such as vmstat and pstat show whether the system is paging; ps can report the memory requirements of each process. The System V utility sar provides information about virtually all aspects of memory performance.

To prevent paging, you must either make more memory available or decrease the extent to which jobs compete. To do this, you can tune system parameters, which is beyond the scope of this book (see O'Reilly & Associates' System Performance Tuning by Mike Loukides for help). You can also terminate (38.10) the jobs with the largest memory requirements. If your system has a lot of memory, the kernel's memory requirements will be relatively small; the typical antagonists are very large application programs.
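Finding the jobs with the largest memory requirements can be sketched as follows. This is only an illustration: it assumes ps output in three columns (PID, virtual size in kilobytes, command) with a single header line, which is what ps -e -o pid,vsz,comm prints on systems whose ps accepts the POSIX -o option. The function name largest_processes and the sample data are hypothetical.

```python
def largest_processes(ps_output, top=3):
    """Return (pid, vsz, command) for the `top` largest processes.

    `ps_output` is assumed to hold one header line followed by rows of
    PID, VSZ (kilobytes), and command name.
    """
    rows = []
    for line in ps_output.strip().splitlines()[1:]:
        fields = line.split(None, 2)
        if len(fields) < 3 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue  # skip lines that do not match the assumed layout
        pid, vsz, command = fields
        rows.append((int(vsz), int(pid), command.strip()))
    rows.sort(reverse=True)  # largest virtual size first
    return [(pid, vsz, command) for vsz, pid, command in rows[:top]]


# Made-up sample in the assumed format, for illustration only.
SAMPLE = """\
  PID    VSZ COMMAND
    1   1200 init
  214  98000 bigsim
  305   4300 vi
  412  51200 cc1
"""
```

With the sample above, largest_processes(SAMPLE, top=2) picks out bigsim and cc1. Virtual size overstates what a process actually keeps in physical memory, so treat the ranking as a starting point for investigation rather than a verdict.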

39.12.3 The I/O Subsystem

The I/O subsystem is a common source of resource contention problems. A finite amount of I/O bandwidth must be shared by all the programs (including the UNIX kernel) that currently run. The system's I/O buses can transfer only so many megabytes per second; individual devices are even more limited. Each kind of device has its own peculiarities and, therefore, its own problems. Unfortunately, UNIX has poor tools for analyzing the I/O subsystem. Under BSD UNIX, iostat can give you information about the transfer rates for each disk drive; ps and vmstat can give some information about how many processes are blocked waiting for I/O; and netstat and nfsstat report various network statistics. Under System V, sar can provide voluminous information about I/O efficiency, and sadp (V.4) can give detailed information about disk access patterns. However, there is no standard tool to measure the I/O subsystem's response to a heavy load.

The disk and network subsystems are particularly important to overall performance. Disk bandwidth issues have two general forms: maximizing per-process transfer rates and maximizing aggregate transfer rates. The per-process transfer rate is the rate at which a single program can read or write data. The aggregate transfer rate is the maximum total bandwidth that the system can provide to all programs that run.
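The difference between the two rates is easy to see with a little arithmetic. The sketch below uses a deliberately idealized model in which the aggregate bandwidth is split evenly among active processes and no single process can exceed its own per-process limit; the function name and all figures are hypothetical, and real disks rarely share bandwidth this neatly.

```python
def fair_share_rate(aggregate_rate, per_process_limit, active_processes):
    """Estimate the rate each process sees under an even split.

    All rates use the same unit (for example, megabytes per second).
    """
    if active_processes < 1:
        raise ValueError("active_processes must be at least 1")
    # A process gets its even share, but never more than it could
    # achieve if it had the device to itself.
    return min(per_process_limit, aggregate_rate / active_processes)
```

For instance, with a hypothetical aggregate rate of 10 and a per-process limit of 4, two processes are each held to 4 by the per-process limit, while five processes are each held to 2 by the aggregate limit.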

Network I/O problems have two basic forms: a network can be overloaded or a network can lose data integrity. When a network is overloaded, the amount of data that needs to be transferred across the network is greater than the network's capacity; therefore, the actual transfer rate for any task is relatively slow. Network load problems can usually be solved by changing the network's configuration. Integrity problems occur when the network is faulty and intermittently transfers data incorrectly. In order to deliver correct data to the applications using the network, the network protocols may have to transmit each block of data many times. Consequently, programs using the network will run very slowly. The only way to solve a data integrity problem is to isolate the faulty part of the network and replace it.
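To see why retransmission hurts so much, consider a simplified model, assumed here purely for illustration: each block is corrupted independently with a fixed probability and is resent until it arrives intact. Real protocols add timeouts and backoff that make matters worse. The function names are hypothetical.

```python
def expected_transmissions(block_error_probability):
    """Average number of times a block is sent before it arrives intact."""
    p = block_error_probability
    if not 0.0 <= p < 1.0:
        raise ValueError("probability must be in the range [0, 1)")
    # Mean of a geometric distribution with success probability (1 - p).
    return 1.0 / (1.0 - p)


def effective_rate(raw_rate, block_error_probability):
    """Useful transfer rate once retransmissions are accounted for."""
    return raw_rate / expected_transmissions(block_error_probability)
```

Under this model, a network that corrupts half of its blocks sends each block twice on average and delivers only half of its raw transfer rate.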

39.12.4 User Communities

So far we have discussed the different factors that contribute to overall system performance. But we have ignored one of the most important factors: the users who submit the jobs.

In talking about the relationship between users and performance, it is easy to start seeing users as problems: the creatures who keep your system from running the way it ought to. Nothing is further from the truth. Computers are tools: they exist to help users do their work and not vice versa.

Limitations on memory requirements, file size, job priorities, etc., are effective only when everyone cooperates. Likewise, you can't force people to submit their jobs to a batch queue (40.6). Most people will cooperate when they understand a problem and what they can do to solve it. Most people will resist a solution that is imposed from above, that they don't understand, or that seems to get in the way of their work.

The nature of your system's users has a big effect on your system's performance. We can divide users into several classes:

All three groups can cause problems. Several dozen users running grep and accessing remote filesystems can be as bad for overall performance as a few users accessing gigabyte files. However, the types of problems these groups cause are not the same. For example, setting up a "striped filesystem" will help disk performance for large, I/O-bound jobs but won't help (and may hurt) users who run many small jobs. Setting up batch queues will help reduce contention among large jobs, which can often be run overnight, but it won't help the system if its problems arise from users typing at their text editors and reading their mail.

Modern systems with network facilities (1.33) complicate the picture even more. In addition to knowing what kinds of work users do, you also need to know what kind of equipment they use: a standard terminal over an RS-232 line, an X terminal over Ethernet, or a diskless workstation? The X Window System requires a lot of memory and puts a heavy load on the network. Likewise, diskless workstations place a load on the network. Similarly, do users access local files or remote files via NFS or RFS?

- ML from O'Reilly & Associates' System Performance Tuning, Chapter 1

