NetBSD ships with a variety of performance monitoring tools. Most of them are common to all UNIX systems. This section gives example usage of the tools along with an interpretation of their output.
The top monitor does what its name suggests: it displays the top CPU consumers on the system. To run the monitor, type top at the prompt. Without any arguments, the display should look like this:
load averages:  0.09,  0.12,  0.08                                    20:23:41
21 processes:  20 sleeping, 1 on processor
CPU states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Memory: 15M Act, 1104K Inact, 208K Wired, 22M Free, 129M Swap free

  PID USERNAME PRI NICE   SIZE   RES STATE     TIME   WCPU    CPU COMMAND
13663 root       2    0  1552K 1836K sleep     0:08  0.00%  0.00% httpd
  127 root      10    0   129M 4464K sleep     0:01  0.00%  0.00% mount_mfs
22591 root       2    0   388K 1156K sleep     0:01  0.00%  0.00% sshd
  108 root       2    0   132K  472K sleep     0:01  0.00%  0.00% syslogd
22597 jrf       28    0   156K  616K onproc    0:00  0.00%  0.00% top
22592 jrf       18    0   828K 1128K sleep     0:00  0.00%  0.00% tcsh
  203 root      10    0   220K  424K sleep     0:00  0.00%  0.00% cron
    1 root      10    0   312K  192K sleep     0:00  0.00%  0.00% init
  205 root       3    0    48K  432K sleep     0:00  0.00%  0.00% getty
  206 root       3    0    48K  424K sleep     0:00  0.00%  0.00% getty
  208 root       3    0    48K  424K sleep     0:00  0.00%  0.00% getty
  207 root       3    0    48K  424K sleep     0:00  0.00%  0.00% getty
13667 nobody     2    0  1660K 1508K sleep     0:00  0.00%  0.00% httpd
 9926 root       2    0   336K  588K sleep     0:00  0.00%  0.00% sshd
  200 root       2    0    76K  456K sleep     0:00  0.00%  0.00% inetd
  182 root       2    0    92K  436K sleep     0:00  0.00%  0.00% portsentry
  180 root       2    0    92K  436K sleep     0:00  0.00%  0.00% portsentry
13666 nobody    -4    0  1600K 1260K sleep     0:00  0.00%  0.00% httpd
The top utility is great for finding CPU hogs, runaway processes or groups of processes that may be causing problems. The output shown above indicates that this particular system is in good health. Now, the next display should show some very different results:
load averages:  0.34,  0.16,  0.13                                    21:13:47
25 processes:  24 sleeping, 1 on processor
CPU states:  0.5% user,  0.0% nice,  9.0% system,  1.0% interrupt, 89.6% idle
Memory: 20M Act, 1712K Inact, 240K Wired, 30M Free, 129M Swap free

  PID USERNAME PRI NICE   SIZE   RES STATE     TIME   WCPU    CPU COMMAND
 5304 jrf       -5    0    56K  336K sleep     0:04 66.07% 19.53% bonnie
 5294 root       2    0   412K 1176K sleep     0:02  1.01%  0.93% sshd
  108 root       2    0   132K  472K sleep     1:23  0.00%  0.00% syslogd
  187 root       2    0  1552K 1824K sleep     0:07  0.00%  0.00% httpd
 5288 root       2    0   412K 1176K sleep     0:02  0.00%  0.00% sshd
 5302 jrf       28    0   160K  620K onproc    0:00  0.00%  0.00% top
 5295 jrf       18    0   828K 1116K sleep     0:00  0.00%  0.00% tcsh
 5289 jrf       18    0   828K 1112K sleep     0:00  0.00%  0.00% tcsh
  127 root      10    0   129M 8388K sleep     0:00  0.00%  0.00% mount_mfs
  204 root      10    0   220K  424K sleep     0:00  0.00%  0.00% cron
    1 root      10    0   312K  192K sleep     0:00  0.00%  0.00% init
  208 root       3    0    48K  432K sleep     0:00  0.00%  0.00% getty
  210 root       3    0    48K  424K sleep     0:00  0.00%  0.00% getty
  209 root       3    0    48K  424K sleep     0:00  0.00%  0.00% getty
  211 root       3    0    48K  424K sleep     0:00  0.00%  0.00% getty
  217 nobody     2    0  1616K 1272K sleep     0:00  0.00%  0.00% httpd
  184 root       2    0   336K  580K sleep     0:00  0.00%  0.00% sshd
  201 root       2    0    76K  456K sleep     0:00  0.00%  0.00% inetd
At first glance it is obvious which process is hogging the system; what is more interesting in this case is why. The bonnie program is a disk benchmark tool that can write large files in a variety of sizes and ways. The previous output indicates only that bonnie is a CPU hog, not why.
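When a top listing has been captured to a file or pasted into a report, the same "which process is the hog" question can be answered mechanically. The following is a minimal sketch in Python, not part of NetBSD or of top itself. The helper names parse_top_rows and busiest are hypothetical, and the parser assumes the eleven-column layout shown in the displays above (PID through COMMAND); a different top version or display mode may need different field positions.

```python
# Hypothetical helpers: rank processes from a captured top(1) listing by WCPU.
# Assumed column order, taken from the displays above:
# PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND

SAMPLE = """\
  PID USERNAME PRI NICE   SIZE   RES STATE     TIME   WCPU    CPU COMMAND
 5304 jrf       -5    0    56K  336K sleep     0:04 66.07% 19.53% bonnie
 5294 root       2    0   412K 1176K sleep     0:02  1.01%  0.93% sshd
  108 root       2    0   132K  472K sleep     1:23  0.00%  0.00% syslogd
 5302 jrf       28    0   160K  620K onproc    0:00  0.00%  0.00% top
"""


def parse_top_rows(text):
    """Return one dict per process line, skipping headers and blank lines."""
    rows = []
    for line in text.splitlines():
        # Split into at most 11 fields so the command stays in one piece.
        fields = line.split(None, 10)
        if len(fields) != 11 or not fields[0].isdigit():
            continue
        rows.append({
            "pid": int(fields[0]),
            "user": fields[1],
            "state": fields[6],
            "wcpu": float(fields[8].rstrip("%")),
            "cpu": float(fields[9].rstrip("%")),
            "command": fields[10].strip(),
        })
    return rows


def busiest(rows, count=3):
    """Return the `count` rows with the highest weighted CPU percentage."""
    return sorted(rows, key=lambda row: row["wcpu"], reverse=True)[:count]


rows = parse_top_rows(SAMPLE)
hogs = busiest(rows)
for row in hogs:
    print(f'{row["pid"]:>6} {row["user"]:<8} {row["wcpu"]:6.2f}% {row["command"]}')
```

As in the interactive display, this only identifies the process at the top of the list. It says nothing about why that process is busy.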
A careful reading of the top(1) manual page shows that a lot more can be done with top. For example, processes can have their priority changed or be killed from within it. Additionally, filters can be set to limit which processes are displayed.
As the systat(1) manual page indicates, the systat utility shows a variety of system statistics using the curses library. While it is running, the screen is split into two parts: the upper window shows the current load average, while the contents of the lower window depend on user commands. The exception to the split view is the vmstat display, which takes up the whole screen. The following is what systat looks like on a fairly idle system when invoked with no arguments:
                    /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
     Load Average   |

          /0   /10  /20  /30  /40  /50  /60  /70  /80  /90  /100
<idle>    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
That shows mostly idle time. Now have a look with an argument provided, in this case systat inet.tcp, which looks like this:
                    /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
     Load Average   |

        0 connections initiated           19 total TCP packets sent
        0 connections accepted            11   data
        0 connections established          0   data (retransmit)
                                           8   ack-only
        0 connections dropped              0   window probes
        0   in embryonic state             0   window updates
        0   on retransmit timeout          0   urgent data only
        0   by keepalive                   0   control
        0   by persist
                                          29 total TCP packets received
       11 potential rtt updates           17   in sequence
       11 successful rtt updates           0   completely duplicate
        9 delayed acks sent                0   with some duplicate data
        0 retransmit timeouts              4   out of order
        0 persist timeouts                 0   duplicate acks
        0 keepalive probes                11   acks
        0 keepalive timeouts               0   window probes
                                           0   window updates
Now that is informative. The first poll is cumulative, so quite a lot of information appears in the output as soon as systat is invoked. While that may be interesting, how about a look at the buffer cache with systat bufcache:
                    /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
     Load Average

There are 1642 buffers using 6568 kBytes of memory.

File System          Bufs used   %   kB in use   %  Bufsize kB   %  Util %
/                          877  53        6171  93        6516  99      94
/var/tmp                     5   0          17   0          28   0      60

Total:                     882  53        6188  94        6544  99
Again, a pretty boring system, but great information to have available. While this is all nice to look at, it is time to put an artificial load on the system to see how systat can be used as a performance monitoring tool. As with top, bonnie++ will be used to put a high load on the I/O subsystem and a little on the CPU. The bufcache display will be examined again to see if there are any noticeable differences:
                    /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
     Load Average   |||

There are 1642 buffers using 6568 kBytes of memory.

File System          Bufs used   %   kB in use   %  Bufsize kB   %  Util %
/                          811  49        6422  97        6444  98      99

Total:                     811  49        6422  97        6444  98
First, notice that the load average shot up, which is to be expected. Then, while most of the numbers are close to the earlier ones, notice that utilization is at 99%. The utilization percentage remained at 99 the whole time bonnie++ was running. That makes sense here, but in a real troubleshooting situation it could indicate a process doing heavy I/O on one particular file or file system.
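The comparison made by eye above can also be sketched in code for cases where bufcache output has been saved as text. This is an illustration only, written in Python rather than taken from systat. The names parse_bufcache and heavily_used are hypothetical, the 95% threshold is an arbitrary assumption rather than a recommended value, and the parser assumes mount points contain no spaces and that each file system row has the eight columns shown in the displays above.

```python
# Hypothetical helpers: read per-file-system rows from saved
# "systat bufcache" text and flag file systems with high utilization.

IDLE = """\
File System          Bufs used   %   kB in use   %  Bufsize kB   %  Util %
/                          877  53        6171  93        6516  99      94
/var/tmp                     5   0          17   0          28   0      60
Total:                     882  53        6188  94        6544  99
"""

LOADED = """\
File System          Bufs used   %   kB in use   %  Bufsize kB   %  Util %
/                          811  49        6422  97        6444  98      99
Total:                     811  49        6422  97        6444  98
"""


def parse_bufcache(text):
    """Map each mount point to its buffer counts and utilization."""
    usage = {}
    for line in text.splitlines():
        fields = line.split()
        # File system rows start with "/" and carry 7 numeric columns.
        if len(fields) != 8 or not fields[0].startswith("/"):
            continue
        numbers = [int(value) for value in fields[1:]]
        usage[fields[0]] = {
            "bufs": numbers[0],
            "kb_in_use": numbers[2],
            "bufsize_kb": numbers[4],
            "util": numbers[6],
        }
    return usage


def heavily_used(usage, threshold=95):
    """Return mount points whose Util % is at or above `threshold`."""
    return sorted(m for m, stats in usage.items() if stats["util"] >= threshold)


idle = parse_bufcache(IDLE)
loaded = parse_bufcache(LOADED)
print("idle:  ", heavily_used(idle))
print("loaded:", heavily_used(loaded))
```

A single high reading is not a diagnosis. As the text notes, what stood out during the bonnie++ run was that the figure stayed at 99 for the whole run, so a real check would compare several samples over time.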