Process monitoring using prstat

The top command is the first choice of most Solaris administrators, especially those who also work on Linux. Some do use prstat, but they don't see the benefits of its format, which differs somewhat from top's.

As a seasoned Solaris veteran, I'm part of an elite team who know how to really take advantage of what prstat has to offer as a true Solaris sysadmin's tool. I hope these command examples will teach you a thing or two.

In its simplest form, the command prstat <interval> will examine all processes and report statistics sorted by CPU usage.

  PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
24382 mchurchi   11M 3236K cpu0    59    0   0:00:00 0.0% prstat/1
24360 mchurchi   18M 4788K sleep   59    0   0:00:00 0.0% sshd/1
24361 mchurchi   10M 2188K sleep   59    0   0:00:00 0.0% bash/1
  584 root       13M 3832K sleep   59    0   0:01:59 0.0% nscd/51
  154 mchurchi   13M 2068K sleep   59    0   0:00:00 0.0% gnome-termin/2
  183 root     1772K  776K sleep   59    0   0:00:13 0.0% hald/1
  538 mchurchi   11M 2572K sleep   59    0   0:00:00 0.0% nautilus/4
  655 mchurchi 7316K 4288K sleep   59    0   0:01:53 0.0% xscreensaver/4
  163 root     3220K 1244K sleep   59    0   0:00:00 0.0% cron/3
  964 noaccess   10M 1760K sleep   59    0   0:00:01 0.0% ttymon/2
  565 root       14M 1776K sleep   59    0   0:00:00 0.0% bash/1
  475 root       11M  808K sleep   59    0   0:00:00 0.0% net-physical/1
  105 root     9672K  868K sleep   59    0   0:01:04 0.0% in.mpathd/1
   48 netcfg   3784K 1740K sleep   59    0   0:00:43 0.0% dbus-daemon/4
  114 root     2124K 1164K sleep   59    0   0:00:00 0.0% pfexecd/3
   89 netadm   4240K 2124K sleep   59    0   0:00:00 0.0% ipmgmtd/5
  257 daemon     14M 3400K sleep   59    0   0:00:00 0.0% syseventd/3
  657 root     4076K 1164K sleep   59    0   0:00:00 0.0% hald-runner/1
   43 root       15M 3424K sleep   59    0   0:00:41 0.0% mixer-applet/5
  824 root     1980K 1220K sleep   59    0   0:00:00 0.0% powernowd/3
   13 root       27M   25M sleep   59    0   0:03:00 0.0% gconfd/22
   11 root       14M   11M sleep   59    0   0:00:42 0.0% svc.startd/13
  606 root     8784K  856K sleep   59    0   0:00:00 0.0% iscsid/2
    8 root        0K    0K sleep   99  -20   0:00:00 0.0% vmtasks/1
    7 root        0K    0K sleep   60    -   0:00:26 0.0% intrd/1
Total: 78 processes, 214 lwps, load averages: 0.89, 0.39, 0.35

In the above example, processes are ordered from top (highest) to bottom (lowest) according to their current CPU usage, expressed as a percentage, where 100% means all system CPUs are fully utilized. For each process in the list, the following information is printed:

  • PID: the process ID of the process.
  • USERNAME: the real user (login) name or real user ID.
  • SIZE: the total virtual memory size of the process, including all mapped files and devices, in kilobytes (K), megabytes (M), or gigabytes (G).
  • RSS: the resident set size of the process, in kilobytes (K), megabytes (M), or gigabytes (G).
  • STATE: the state of the process (cpuN/sleep/wait/run/zombie/stop).
  • PRI: the priority of the process. Larger numbers mean higher priority.
  • NICE: nice value used in priority computation. Only processes in certain scheduling classes have a nice value.
  • TIME: the cumulative execution time for the process.
  • CPU: the percentage of recent CPU time used by the process. If executing in a non-global zone and the pools facility is active, the percentage will be that of the processors in the processor set in use by the pool to which the zone is bound.
  • PROCESS: the name of the process (name of executed file).
  • NLWP: the number of LWPs (lightweight processes) in the process.
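Because the columns are whitespace-separated, the output is easy to post-process. Here is a minimal sketch that splits the last column into PROCESS and NLWP and lists the processes with the most LWPs first. The function name lwp_counts is my own invention, and I'm assuming a POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk is the safer choice).

```shell
# lwp_counts (hypothetical helper): read default prstat output on stdin
# and print "NLWP PROCESS PID", highest LWP count first.
lwp_counts() {
    awk '
        # Process rows start with a numeric PID and end with PROCESS/NLWP;
        # the header, the user summary and the "Total:" line do not match.
        $1 ~ /^[0-9]+$/ && $NF ~ /\// {
            n = split($NF, parts, "/")
            print parts[n], parts[1], $1
        }
    ' | sort -rn
}

# usage (on a Solaris box): prstat -c 1 1 | lwp_counts | head -5
```

Fed the listing above, the first line it prints should be the nscd process with its 51 LWPs.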

The <interval> argument given to prstat is the sampling/refresh interval in seconds.
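prstat also accepts an optional count after the interval, which is handy when you want a single snapshot instead of a continuously refreshing screen. The sketch below wraps that in a small function; snapshot_top is a hypothetical name of mine, while -c (print reports one below another instead of overwriting the screen) and -n (limit the number of output lines) are standard prstat options.

```shell
# snapshot_top [N] (hypothetical helper): print one report showing the
# N busiest processes (default 10) and exit.
snapshot_top() {
    ntop=${1:-10}
    # "1 1" means: sample over a one-second interval, print one report.
    prstat -c -n "$ntop" 1 1
}

# usage: snapshot_top 5
```

A one-shot report like this is also what you want when piping prstat into other tools.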

The above example is useful because we can easily see the top consumers of CPU, but prstat can do considerably more, as the following sections show.

Special Report - Sorting

The prstat output can be sorted by a criterion other than CPU usage. Use the option -s (descending) or -S (ascending) with the criterion of your choice:

Criteria  Comments
cpu       Sort by process CPU usage. This is the default.
pri       Sort by process priority.
rss       Sort by resident set size.
size      Sort by size of process image.
time      Sort by process execution time.
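As a quick illustration, here is a small wrapper that only accepts the sort keys from the table above before handing them to prstat -s. The name prstat_sorted and the 5-second default interval are my own assumptions, not anything prstat provides.

```shell
# prstat_sorted KEY [INTERVAL] (hypothetical helper): run prstat sorted
# in descending order by KEY, refreshing every INTERVAL seconds.
prstat_sorted() {
    key=$1
    interval=${2:-5}
    case $key in
        cpu|pri|rss|size|time)
            prstat -s "$key" "$interval"
            ;;
        *)
            echo "unknown sort key: $key" >&2
            return 1
            ;;
    esac
}

# usage: prstat_sorted rss 10    # biggest resident set sizes first
```

Swap -s for -S in the function body if you prefer ascending order.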

prstat with additional reports about users

If you run prstat with the -a option, the output resembles the default one, but the last few lines provide a really useful report of the users consuming the most system resources.

  PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
 24383 mchurchi   11M 3236K cpu0    59    0   0:00:00 0.0% prstat/1
 24360 mchurchi   18M 4788K sleep   59    0   0:00:00 0.0% sshd/1
 24361 mchurchi   10M 2192K sleep   59    0   0:00:00 0.0% bash/1
        :
        :
     7 root        0K    0K sleep   60    -   0:00:26 0.0% intrd/1
 NPROC USERNAME  SWAP   RSS MEMORY      TIME  CPU
     5 mchurchi   52M   13M   1.3%   0:00:00 0.0%
    50 root      841M  571M    56%   0:22:22 0.0%
     2 noaccess   20M 3776K   0.4%   0:00:43 0.0%
     1 netcfg   3784K 1740K   0.2%   0:00:43 0.0%
     1 netadm   4240K 2124K   0.2%   0:00:00 0.0%
     2 daemon     17M 4520K   0.4%   0:00:04 0.0%
Total: 78 processes, 214 lwps, load averages: 0.89, 0.39, 0.35
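If you only care about the per-user block, you can cut it out of the output. The following is a rough sketch that assumes the layout shown above (a header line starting with NPROC, ended by the Total: line) and a POSIX awk; user_memory is a made-up helper name.

```shell
# user_memory (hypothetical helper): read "prstat -a" output on stdin and
# print "MEMORY% USERNAME" for each user, biggest memory consumer first.
user_memory() {
    awk '
        $1 == "NPROC"  { in_users = 1; next }   # start of the user summary
        $1 == "Total:" { in_users = 0 }         # footer closes the summary
        # Adding 0 turns a value such as "56%" into the number 56.
        in_users == 1  { print $5 + 0, $2 }
    ' | sort -rn
}

# usage: prstat -a -c 1 1 | user_memory
```

Run against the sample above, root comes out on top with 56 percent of memory.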

prstat and microstate accounting

If you want to get down to really low level details of your system's wellbeing in terms of CPU and memory usage, you'll love the microstate accounting support in prstat.

Activated by the -m option (prstat -m), this mode shows many columns of percentages that tell you exactly how each process (or LWP) is spending its time.

Useful applications of microstate accounting include finding out how long a thread spends sleeping and whether it is starved of CPU resources (shown by CPU latency, the LAT column).
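To show how you might hunt for CPU-starved processes, here is a sketch that filters prstat -m output on the LAT column. It looks the column up by name in the header line rather than hard-coding its position. The helper name cpu_starved and the default threshold of 10 percent are my own choices, and it assumes an awk that understands -v (nawk or /usr/xpg4/bin/awk on Solaris).

```shell
# cpu_starved [THRESHOLD] (hypothetical helper): read "prstat -m" output on
# stdin and print "PID PROCESS LAT" for rows whose LAT value is at or
# above THRESHOLD percent (default 10).
cpu_starved() {
    awk -v min="${1:-10}" '
        # Remember which column is labelled LAT in the header line.
        $1 == "PID" {
            for (i = 1; i <= NF; i++) if ($i == "LAT") lat = i
            next
        }
        # Data rows start with a numeric PID.
        lat && $1 ~ /^[0-9]+$/ && $lat + 0 >= min { print $1, $NF, $lat }
    '
}

# usage: prstat -m -c 5 1 | cpu_starved 20
```

A process that shows up here regularly is ready to run but keeps waiting for a CPU.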

Special Report - by Zones

The command prstat -Z adds a summary of each zone's resource usage below the process list.
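The zone summary can be pulled out in the same way as any other block of prstat output. This sketch assumes the summary begins with a header line whose first column is ZONEID and whose last two columns are CPU and ZONE; check that against the output on your own system. zone_cpu is a hypothetical helper name.

```shell
# zone_cpu (hypothetical helper): read "prstat -Z" output on stdin and
# print "ZONE CPU%" for each zone in the summary block.
zone_cpu() {
    awk '
        $1 == "ZONEID" { in_zones = 1; next }   # start of the zone summary
        $1 == "Total:" { in_zones = 0 }         # footer closes the summary
        in_zones == 1  { print $NF, $(NF - 1) }
    '
}

# usage: prstat -Z -c 1 1 | zone_cpu
```

To look at the processes of just one zone, prstat also has a lower-case -z option that takes a list of zone names or IDs.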

Microstate Accounting

Unlike other operating systems that gather CPU statistics every clock tick or every fixed time interval (typically every hundredth of a second), Solaris 10 incorporates a technology called microstate accounting that uses high-resolution timestamps to measure CPU statistics for every event, thus producing extremely accurate statistics.

The microstate accounting system maintains accurate time counters for threads as well as CPUs. Thread-based microstate accounting tracks several meaningful states per thread in addition to user and system time, which include trap time, lock time, sleep time and latency time. prstat -m reports per-process microstates, and prstat -mL reports per-thread microstates.
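When one process is misbehaving, it helps to narrow the per-thread view down to that process with -p. A minimal sketch, assuming you already know the PID; thread_microstates and its 5-second default interval are my own inventions, while -m, -L and -p are standard prstat options.

```shell
# thread_microstates PID [INTERVAL] (hypothetical helper): show per-LWP
# microstates for a single process, refreshing every INTERVAL seconds.
thread_microstates() {
    pid=$1
    interval=${2:-5}
    # -m: microstate columns, -L: one row per LWP, -p: restrict to this PID
    prstat -mL -p "$pid" "$interval"
}

# usage: thread_microstates 584
```

With -L, the last column reads PROCESS/LWPID instead of PROCESS/NLWP, so each row is one thread.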

To summarise

I really like prstat because it gives me access to the following information:

  • microstate accounting with LOTS of CPU info
  • CPU usage stats across global and non-global zones
  • reports (multiple screens of stats taken at specific intervals) that can be redirected to a file
  • really useful summaries of the top users consuming your OS resources
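For the reports-to-a-file case, here is one way it could be done: combine -c (so each report is printed below the previous one) with an interval and a count, and redirect the lot. The helper name prstat_log, the 20-line limit and the defaults of 60 seconds and 10 reports are assumptions of mine.

```shell
# prstat_log FILE [INTERVAL] [COUNT] (hypothetical helper): append COUNT
# reports, taken INTERVAL seconds apart, to FILE.
prstat_log() {
    logfile=${1:?usage: prstat_log FILE [INTERVAL] [COUNT]}
    interval=${2:-60}
    count=${3:-10}
    # -c keeps the output free of screen-redraw sequences;
    # -n 20 limits each report to 20 lines.
    prstat -c -n 20 "$interval" "$count" >> "$logfile"
}

# usage: prstat_log /var/tmp/prstat.log 30 120    # one hour of samples
```

Because the file is opened in append mode, repeated runs build up a history you can grep through later.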