In addition to screen-oriented monitors and tools, the NetBSD system also ships with a set of command-line tools. Many of them can be found on other UNIX and UNIX-like systems.
The fstat(1) utility reports the status of open files on the system. While it is not what many administrators consider a performance monitor, it can help determine whether a particular user or process is using an inordinate number of files, generating large files, and so on.
Following is a sample of some fstat output:
USER     CMD          PID   FD MOUNT      INUM MODE         SZ|DV R/W
jrf      tcsh       21607   wd /         29772 drwxr-xr-x     512 r
jrf      tcsh       21607    3* unix stream c057acc0<-> c0553280
jrf      tcsh       21607    4* unix stream c0553280 <-> c057acc0
root     sshd       21597   wd /             2 drwxr-xr-x     512 r
root     sshd       21597    0 /         11921 crw-rw-rw-    null rw
nobody   httpd       5032   wd /             2 drwxr-xr-x     512 r
nobody   httpd       5032    0 /         11921 crw-rw-rw-    null r
nobody   httpd       5032    1 /         11921 crw-rw-rw-    null w
nobody   httpd       5032    2 /         15890 -rw-r--r--  353533 rw
...
The fields are largely self-explanatory. Again, while this tool is not as performance-oriented as others, it can come in handy when trying to find out information about file usage.
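As a rough illustration of the kind of question fstat can answer, the open descriptors held by each process can be tallied with awk. This is only a sketch: the function name files_per_process is made up for this example, and it assumes that USER, CMD and PID are the first three fields, as in the sample output above.

```shell
#!/bin/sh
# files_per_process: hypothetical helper, not part of NetBSD.
# Reads fstat-style output on stdin and prints "count user cmd pid",
# busiest process first. Assumes USER, CMD and PID are fields 1-3,
# as in the sample output; the header and short lines are skipped.
files_per_process() {
    awk '$1 != "USER" && NF >= 3 { n[$1 " " $2 " " $3]++ }
         END { for (k in n) print n[k], k }' | sort -k1,1nr -k4,4n
}

# On a live system one might run: fstat | files_per_process | head
```

A process that sits far above the others in this list is a reasonable place to start looking.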
The iostat(8) command does exactly what its name suggests: it reports the status of the I/O subsystems on the system. When iostat is employed, the user typically runs it with a certain number of counts and an interval between them, like so:
ftp% iostat 5 5
      tty       wd0             cd0             fd0             md0             cpu
 tin tout   KB/t t/s MB/s   KB/t t/s MB/s   KB/t t/s MB/s   KB/t t/s MB/s  us ni sy in  id
   0    1   5.13   1 0.00   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   0  0  0  0 100
   0   54   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   0  0  0  0 100
   0   18   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   0  0  0  0 100
   0   18   8.00   0 0.00   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   0  0  0  0 100
   0   28   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   0  0  0  0 100
The above output is from a very quiet ftp server. The column groups represent the various I/O devices: the tty (which, ironically, is the most active because iostat is running); wd0, the primary IDE disk; cd0, the CD-ROM drive; fd0, the floppy drive; and md0, the memory filesystem.
Now, let's see if we can pummel the system with some heavy usage: a large ftp transfer consisting of a tarball of the netbsd-current source, with the bonnie++ disk benchmark program running at the same time.
ftp% iostat 5 5
      tty       wd0             cd0             fd0             md0             cpu
 tin tout   KB/t t/s MB/s   KB/t t/s MB/s   KB/t t/s MB/s   KB/t t/s MB/s  us ni sy in  id
   0    1   5.68   1 0.00   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   0  0  0  0 100
   0   54  61.03 150 8.92   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   1  0 18  4  78
   0   26  63.14 157 9.71   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   1  0 20  4  75
   0   20  43.58  26 1.12   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   0  0  9  2  88
   0   28  19.49  82 1.55   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   1  0  7  3  89
As expected, wd0 is very active. What is interesting about this output is how the CPU's system (sy) and interrupt (in) time seems to rise in proportion to the activity on wd0. This makes perfect sense; however, it can be observed only because this ftp server is hardly being used. If, for example, the CPU were already under a moderate load and the disk subsystem were under the same load as it is now, it could appear that the CPU is the bottleneck when in fact it is the disk. In such a case, a single tool is rarely enough to completely analyze a problem. A quick glance at the process list (after watching iostat) would probably tell us which processes were causing problems.
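To make that relationship easier to see, the disk throughput and the combined system and interrupt time can be pulled out of each sample. The following is a minimal sketch, not a general-purpose parser: disk_vs_cpu is a made-up name, and the field positions assume exactly the layout shown above (tty, four devices, then cpu, for 19 columns in total). A machine with a different number of devices would need different field numbers.

```shell
#!/bin/sh
# disk_vs_cpu: hypothetical helper, not part of NetBSD.
# Reads iostat output on stdin and prints, for each sample line,
# the first disk's MB/s (field 5) followed by sy + in (fields 17 and 18).
# Assumes the 19-column layout of the sample output; header lines are
# skipped because their first field is not a number.
disk_vs_cpu() {
    awk 'NF == 19 && $1 ~ /^[0-9]+$/ { printf "%s %d\n", $5, $17 + $18 }'
}

# On a live system one might run: iostat 5 5 | disk_vs_cpu
```

Applied to the loaded samples above, the second column climbs and falls together with the first, which is the pattern described in the text.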
Using the ps(1) (process status) command, a great deal of information about the system can be discovered. Most of the time, the ps command is used to isolate a particular process by name, group, owner, etc. Invoked with no options or arguments, ps simply prints information about the processes belonging to the user executing it.
ftp% ps
  PID TT STAT    TIME COMMAND
21560 p0 Is   0:00.04 -tcsh
21564 p0 I+   0:00.37 ssh jrf.odpn.net
21598 p1 Ss   0:00.12 -tcsh
21673 p1 R+   0:00.00 ps
21638 p2 Is+  0:00.06 -tcsh
Not very exciting. The fields are self-explanatory with the exception of STAT, which is the state a process is in. The flags are all documented in the man page; in the above example, I is idle, S is sleeping, R is runnable, + means the process is in the foreground, and s means the process is a session leader. This all makes sense when looking at the flags: for example, PID 21560 is a shell, it is idle, and (as would be expected) the shell is the session leader.
In most cases, someone is looking for something very specific in the process listing. As an example, -a lists the processes of all users, -ax adds those without controlling terminals, and aux gives a much more verbose listing (basically everything, plus information about the impact processes are having):
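For readers who find the STAT strings hard to scan, the flags named above can be spelled out one character at a time. This is only a sketch: describe_stat is a made-up name, and it knows just the five flags mentioned in this section. Anything else is passed through with a pointer to ps(1) rather than guessed at.

```shell
#!/bin/sh
# describe_stat: hypothetical helper, not part of NetBSD.
# Expands a ps STAT string such as "Is+" using only the flags
# described in the text above; unknown flags are left for ps(1).
# Note: plain sh has no local variables, so s, rest, c, d and out
# are globals here.
describe_stat() {
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}          # everything after the first character
        c=${s%"$rest"}       # the first character itself
        s=$rest
        case $c in
            I) d=idle ;;
            S) d=sleeping ;;
            R) d=runnable ;;
            +) d=foreground ;;
            s) d='session leader' ;;
            *) d="$c (see ps(1))" ;;
        esac
        out="${out:+$out, }$d"
    done
    printf '%s\n' "$out"
}

# Example: describe_stat Is+
```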
ftp# ps aux
USER     PID %CPU %MEM  VSZ  RSS TT STAT STARTED    TIME COMMAND
root       0  0.0  9.6    0 6260 ?? DLs  16Jul02 0:01.00 (swapper)
root   23362  0.0  0.8  144  488 ?? S    12:38PM 0:00.01 ftpd -l
root   23328  0.0  0.4  428  280 p1 S    12:34PM 0:00.04 -csh
jrf    23312  0.0  1.8  828 1132 p1 Is   12:32PM 0:00.06 -tcsh
root   23311  0.0  1.8  388 1156 ?? S    12:32PM 0:01.60 sshd: jrf@ttyp1
jrf    21951  0.0  1.7  244 1124 p0 S+    4:22PM 0:02.90 ssh jrf.odpn.net
jrf    21947  0.0  1.7  828 1128 p0 Is    4:21PM 0:00.04 -tcsh
root   21946  0.0  1.8  388 1156 ?? S     4:21PM 0:04.94 sshd: jrf@ttyp0
nobody  5032  0.0  2.0 1616 1300 ?? I    19Jul02 0:00.02 /usr/pkg/sbin/httpd
...
Again, most of the fields are self-explanatory with the exception of VSZ and RSS, which can be a little confusing. RSS is the real (resident) size of a process in 1024-byte units, while VSZ is the virtual size. This is all great, but again, how can ps help? Well, for one, take a look at this modified version of the same output:
ftp# ps aux
USER     PID %CPU %MEM  VSZ  RSS TT STAT STARTED    TIME COMMAND
root       0  0.0  9.6    0 6260 ?? DLs  16Jul02 0:01.00 (swapper)
root   23362  0.0  0.8  144  488 ?? S    12:38PM 0:00.01 ftpd -l
root   23328  0.0  0.4  428  280 p1 S    12:34PM 0:00.04 -csh
jrf    23312  0.0  1.8  828 1132 p1 Is   12:32PM 0:00.06 -tcsh
root   23311  0.0  1.8  388 1156 ?? S    12:32PM 0:01.60 sshd: jrf@ttyp1
jrf    21951  0.0  1.7  244 1124 p0 S+    4:22PM 0:02.90 ssh jrf.odpn.net
jrf    21947  0.0  1.7  828 1128 p0 Is    4:21PM 0:00.04 -tcsh
root   21946  0.0  1.8  388 1156 ?? S     4:21PM 0:04.94 sshd: jrf@ttyp0
nobody  5032  9.0  2.0 1616 1300 ?? I    19Jul02 0:00.02 /usr/pkg/sbin/httpd
...
Given that our baseline for this server indicates a relatively quiet system, PID 5032 is using an unusually large amount of %CPU. Sometimes this can also cause high TIME numbers. The output of ps can be filtered with grep by PID, username, or process name, and hence help track down processes that may be experiencing problems.
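Besides grep, a numeric filter on the %CPU column can pick out processes that stand above the baseline. The sketch below assumes the ps aux column order shown above (USER, PID, %CPU as the first three fields). The name busy_processes and the default threshold of 5.0 are arbitrary choices for illustration, not recommended values.

```shell
#!/bin/sh
# busy_processes: hypothetical helper, not part of NetBSD.
# Reads "ps aux" output on stdin and prints "pid %cpu user" for every
# process whose %CPU (field 3) exceeds the given limit (default 5.0).
# The header is skipped naturally because "%CPU" evaluates to 0.
busy_processes() {
    limit=${1:-5.0}
    awk -v limit="$limit" \
        'NF >= 11 && $3 + 0 > limit + 0 { print $2, $3, $1 }'
}

# On a live system one might run: ps aux | busy_processes 5.0
```

Run against the modified listing above, this would report only the httpd process with PID 5032.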
Using vmstat(1), information pertaining to virtual memory can be monitored and measured. Like iostat, vmstat can be invoked with a count and an interval. Following is some sample output using 5 5, as in the iostat example:
vmstat 5 5
procs     memory             page              disks         faults        cpu
 r b w    avm   fre flt re  pi  po  fr   sr  w0 c0 f0 m0   in   sy   cs us sy  id
 0 7 0  17716 33160   2  0   0   0   0    0   1  0  0  0  105   15    4  0  0 100
 0 7 0  17724 33156   2  0   0   0   0    0   1  0  0  0  109    6    3  0  0 100
 0 7 0  17724 33156   1  0   0   0   0    0   1  0  0  0  105    6    3  0  0 100
 0 7 0  17724 33156   1  0   0   0   0    0   0  0  0  0  107    6    3  0  0 100
 0 7 0  17724 33156   1  0   0   0   0    0   0  0  0  0  105    6    3  0  0 100
Yet again, relatively quiet. For comparison, the exact same load that was put on this server in the iostat example will be used: a large file transfer and the bonnie++ benchmark program.
vmstat 5 5
procs     memory             page              disks         faults        cpu
 r b w    avm   fre flt re  pi  po  fr   sr  w0 c0 f0 m0   in   sy   cs us sy  id
 1 8 0  18880 31968   2  0   0   0   0    0   1  0  0  0  105   15    4  0  0 100
 0 8 0  18888 31964   2  0   0   0   0    0 130  0  0  0 1804 5539 1094 31 22  47
 1 7 0  18888 31964   1  0   0   0   0    0 130  0  0  0 1802 5500 1060 36 16  49
 1 8 0  18888 31964   1  0   0   0   0    0 160  0  0  0 1849 5905 1107 21 22  57
 1 7 0  18888 31964   1  0   0   0   0    0 175  0  0  0 1893 6167 1082  1 25  75
Just a little different. Notice that since most of the work was I/O based, very little additional memory was actually used. Since this system uses mfs for /tmp, however, it can certainly get beaten up. Have a look at this:
ftp# vmstat 5 5
procs     memory             page              disks         faults        cpu
 r b w    avm   fre flt re  pi  po  fr   sr  w0 c0 f0 m0   in   sy   cs us sy  id
 0 2 0  99188   500   2  0   0   0   0    0   1  0  0  0  105   16    4  0  0 100
 0 2 0 111596   436 592  0 587 624 586 1210 624  0  0  0  741  883 1088  0 11  89
 0 3 0 123976   784 666  0 662 643 683 1326 702  0  0  0  828  993 1237  0 12  88
 0 2 0 134692  1236 581  0 571 563 595 1158 599  0  0  0  722  863 1066  0  9  90
 2 0 0 142860   912 433  0 406 403 405  808 429  0  0  0  552  602  768  0  7  93
Pretty scary stuff. That was created by running bonnie in /tmp on a memory-based filesystem. Had it continued for too long, the system could have started thrashing. Notice that even though the VM subsystem was taking a beating, the processors still were not getting too battered.
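Samples like these can be picked out of a longer vmstat run by watching the pi and po columns. The following is a minimal sketch under a few assumptions: paging_samples is a made-up name, and the layout is assumed to match the output above (four disk columns, 21 fields in all). Fields are counted from the right-hand end, because vmstat's fixed-width columns can run together on the left when avm grows large.

```shell
#!/bin/sh
# paging_samples: hypothetical helper, not part of NetBSD.
# Reads vmstat output on stdin and prints "fre pi po" for every sample
# in which pages were paged in or out. Assumes the column layout of
# the sample output above; counting from the right keeps the positions
# stable even when the w and avm columns are printed without a space.
paging_samples() {
    awk 'NF >= 20 && $(NF - 13) + $(NF - 12) > 0 {
             print $(NF - 16), $(NF - 13), $(NF - 12)
         }'
}

# On a live system one might run: vmstat 5 5 | paging_samples
```

On the quiet samples earlier in this section it prints nothing, while sustained output of the kind seen here suggests the system is short on memory.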