Solaris Crash Dumps

If a SunOS/Solaris system crashes (an event also known as a "panic"), the entire contents of memory can be written to disk for later analysis. Enabling these crash dumps on SunOS/Solaris servers is highly recommended.

Note: Make sure that enough disk space is available (e.g. on large database servers with 500 MB of memory, it may not be possible to dump to swap space).
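As a rough illustration of the space check described in the note, the following shell sketch compares a worst-case dump size (the amount of physical memory) with the free space on the target. The function name dump_fits and its three arguments are hypothetical and not part of SunOS/Solaris; the real sizes would have to be taken from the system (for example from df output), which is not shown here.

```shell
#!/bin/sh
# Sketch only: decide whether a full dump would fit on the target.
# dump_fits is a hypothetical helper; all sizes are in kilobytes.
dump_fits() {
    mem_kb=$1           # physical memory, i.e. worst-case dump size
    free_kb=$2          # free space on the dump/crash filesystem
    minfree_kb=${3:-0}  # space that must remain free afterwards (optional)
    [ `expr "$free_kb" - "$minfree_kb"` -ge "$mem_kb" ]
}

# Example: about 500 MB of memory, but only about 400 MB free.
if dump_fits 512000 409600; then
    echo "enough space for a full dump"
else
    echo "not enough space for a full dump"
fi
```

With the example figures the check fails, which is the situation the note warns about.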

Enabling dumps

  • For Solaris 1 (SunOS 4.x) edit /etc/rc.local and uncomment the following lines:
    # Enable savecore (default is disabled)
    mkdir -p /var/crash/`hostname`
    echo -n 'checking for crash dump... '
    intr savecore /var/crash/`hostname`
    echo '' 
  • For Solaris 2, edit the file /etc/init.d/sysetup and uncomment the following lines:
    ## Enable savecore (default is disabled)
    if [ ! -d /var/crash/`uname -n` ]
    then
        mkdir -p /var/crash/`uname -n`
    fi
    echo 'checking for crash dump...\c '
    savecore /var/crash/`uname -n`
    echo ''

Optional tips

  • Crash dumps may contain highly confidential information, since they hold all application memory at the time of the crash. I highly recommend adding the following lines to the above files to prevent unauthorised access to the dumps:
    chown -R root.staff /var/crash
    chmod -R 600 /var/crash
  • If the file minfree exists in the crash directory, the number in this file specifies how many kilobytes of space must remain free on this filesystem once savecore has completed.
  • Dumping to a special device (i.e. not the swap device) is possible. On Solaris 1, add the following line to the kernel configuration file (assuming you want to use device sd1b) and rebuild the kernel (see man config):
    config vmunix swap on sd1b
    On Solaris 2 it is a bit trickier: adb must be used.
  • If several system crashes are expected, compress previous dumps. They often compress to only 5% of the original size. The same goes for dumps which are archived.
  • Crash dumps MUST be analysed on the same OS version and architecture on which they were created (with savecore).
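The minfree and compression tips above could be combined in a small housekeeping script. This is a minimal sketch: the function name crash_housekeeping and its arguments are hypothetical, and it assumes the dumps carry the usual vmcore.N, vmunix.N or unix.N names and that the compress command is available.

```shell
#!/bin/sh
# Sketch only: reserve free space for savecore and compress saved dumps.
# crash_housekeeping is a hypothetical helper, not a system command.
crash_housekeeping() {
    dir=$1          # crash directory, e.g. /var/crash/`uname -n`
    reserve_kb=$2   # kilobytes that savecore should leave free

    # savecore reads the reserve from the file "minfree" in the crash directory.
    echo "$reserve_kb" > "$dir/minfree"

    # Compress every saved dump file that is not already compressed.
    for f in "$dir"/vmcore.* "$dir"/vmunix.* "$dir"/unix.*; do
        case $f in
            *.Z) continue ;;            # already compressed
        esac
        [ -f "$f" ] && compress "$f"    # skips patterns that matched nothing
    done
    return 0
}

# Example call (commented out so that nothing is changed by accident):
# crash_housekeeping /var/crash/`uname -n` 5000
```

A compressed dump has to be uncompressed again before it can be analysed, so this is best run only on dumps that are not currently under investigation.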

Initial Crash Dump analysis

The following commands can be used to analyse what was going on in the system before the panic occurred.

Description                             Solaris 1 (SunOS 4)                   Solaris 2.x
--------------------------------------- ------------------------------------- -------------------------------------
What OS is this?                        strings vmcore.0 | grep SunOS         strings vmcore.0 | grep SunOS
What host is this?                      strings vmcore.0 | grep machine       strings vmcore.0 | grep machine
What processes were running?            ps -laxk vmunix.0 vmcore.0            use crash (see below)
Show system tables                      pstat -T vmunix.0 vmcore.0
Show network stats                      netstat vmunix.0 vmcore.0             netstat -d unix.0 vmcore.0
Show NFS stats                          nfsstat -n vmunix.0 vmcore.0          nfsstat -n unix.0 vmcore.0
Show arp table                          arp -c vmunix.0 vmcore.0              arp -a unix.0 vmcore.0
Show IPC stuff                          ipcs -a -N vmunix.0 -C vmcore.0       ipcs -a -N unix.0 -C vmcore.0

Using crash:                            /etc/crash -d vmcore.0 -n vmunix.0    /usr/sbin/crash -d vmcore.0 -n unix.0
crash help                              > help                                > help
Help on "p" command                     > help p                              > help p
Show processes                          > p -e                                > p -e
Lots of process details                                                       > p -l
crash details                                                                 > status
Quit crash                              > q                                   > q

Using the adb debugger:                 adb -k vmunix.0 vmcore.0              adb -k unix.0 vmcore.0
What was the panic message?             *panicstr/s                           *panicstr/s
Hostname                                hostname/s                            $<utsname
OS version                              version/s                             $<utsname
Domain                                  domainname/s                          srpc_domain/s
Machine                                 sysname/s                             $<utsname
Manufacturer                                                                  hw_provider/s
Crash time/date                         time/Y                                TIME/y
Boot time/date                          *boottime=Y                           *time-(lbolt%0t100)=Y
Display system messages                 msgbuf+10/s                           msgbuf+14s
Recent message buffer (ring)            $<msgbuf                              $<msgbuf
C stack traceback (not always right!)   $c                                    $c
Stack traceback                         <sp$<stacktrace                       ??
What is root device?                                                          rootfs$<bootobj
What is swap device?                                                          swapfile$<bootobj
Show registers                          $cregs
Show IPC stuff                          ipcaccess/10i
Quit adb                                CTRL-D or $q                          CTRL-D or $q
Access a live kernel:                   adb -k /vmunix /dev/mem               adb -k /dev/ksyms /dev/mem

adb macros are located in /usr/lib/adb (Solaris 1) or /usr/kvm/lib/adb (Solaris 2).
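The adb commands listed for Solaris 2 can also be run non-interactively by feeding them to adb on standard input, which is handy for a first look at a dump. This is a minimal sketch that assumes the dump files are named unix.N and vmcore.N in the current directory; the function name panic_summary is hypothetical.

```shell
#!/bin/sh
# Sketch only: print the panic message, system identification and a C stack
# traceback for saved dump number N (Solaris 2 file naming assumed).
# panic_summary is a hypothetical helper, not a system command.
panic_summary() {
    n=$1   # dump number, e.g. 0 for unix.0 and vmcore.0
    # The quoted 'EOF' keeps the shell from expanding $c and $<utsname.
    adb -k "unix.$n" "vmcore.$n" <<'EOF'
*panicstr/s
$<utsname
$c
EOF
}

# Example call, to be run in the crash directory:
# panic_summary 0
```

On Solaris 1 the same approach should work with vmunix.N and the commands listed for SunOS 4.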

Further reading