Solaris Crash Dumps

Updated 02 Jan 2003

If a SunOS/Solaris system crashes (also known as a "panic"), it can write the entire contents of memory to disk for later analysis. Enabling these crash dumps on SunOS/Solaris servers is highly recommended.

Note: Make sure that enough disk space is available (e.g. on large database servers with 500MB of memory, it may not be possible to dump to swap space).
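
The space check in the note above can be sketched as a small shell function. This is only an illustration: the function name dump_fits and its arguments are made up for this example, and all sizes are assumed to be in kilobytes (the same unit the minfree file uses).

```shell
#!/bin/sh
# Sketch (hypothetical helper, not part of SunOS/Solaris):
# decide whether a full memory dump would fit on the target filesystem.
# Usage: dump_fits MEM_KB FREE_KB [RESERVE_KB]
# Prints "yes" if FREE_KB minus the optional reserve is at least MEM_KB.
dump_fits() {
    mem_kb=$1
    free_kb=$2
    reserve_kb=${3:-0}
    usable_kb=`expr "$free_kb" - "$reserve_kb"`
    if [ "$usable_kb" -ge "$mem_kb" ]; then
        echo yes
    else
        echo no
    fi
}

# Example: 500 MB of memory (512000 KB) against 400000 KB free.
dump_fits 512000 400000    # prints: no
```

On Solaris 2 the two inputs could be read from `prtconf | grep Memory` (reported in megabytes) and `df -k` on the crash directory; how the numbers are collected is left out of the sketch on purpose.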

Enabling dumps

For Solaris 1 (SunOS 4.x) edit /etc/rc.local and uncomment the following lines:

#
# Enable savecore (default is disabled)
#
mkdir -p /var/crash/`hostname`
echo -n 'checking for crash dump... '
intr savecore /var/crash/`hostname`
echo ''

For Solaris 2 edit the file /etc/init.d/sysetup and uncomment the following lines:

##
## Enable savecore (default is disabled)
##
if [ ! -d /var/crash/`uname -n` ]
then
    mkdir -p /var/crash/`uname -n`
fi
echo 'checking for crash dump...\c '
savecore /var/crash/`uname -n`
echo ''
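
Once savecore is enabled, saved dumps show up in the crash directory as numbered pairs (vmunix.N and vmcore.N on Solaris 1, unix.N and vmcore.N on Solaris 2, the names used in the analysis section of this article). A minimal sketch for listing the complete pairs follows; the function name list_dumps is hypothetical.

```shell
#!/bin/sh
# Sketch (hypothetical helper): print the dump numbers for which both the
# memory image (vmcore.N) and a kernel image (unix.N or vmunix.N) exist.
# Usage: list_dumps /var/crash/`uname -n`
list_dumps() {
    dir=$1
    for core in "$dir"/vmcore.*; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$core" ] || continue
        n=`echo "$core" | sed 's/.*vmcore\.//'`
        if [ -f "$dir/unix.$n" ] || [ -f "$dir/vmunix.$n" ]; then
            echo "$n"
        fi
    done
}
```

A dump number that is missing from the output points at an incomplete save, for example because the filesystem filled up while savecore was running.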

Optional tips

Crash dumps can contain highly confidential information, since they hold all application memory at the time of the crash. I highly recommend adding the following lines to the files above to prevent unauthorised access to the dumps.

chown -R root.staff /var/crash
chmod -R 600 /var/crash
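
Note that chmod -R 600 also clears the search (x) bit on the directories themselves. root is not affected by this, so the commands above work as long as root owns everything. As a hedged alternative, the sketch below keeps directories at 700 and files at 600; the function name lock_down is hypothetical, and the chown step is left out because it needs root.

```shell
#!/bin/sh
# Sketch (hypothetical helper): owner-only access to a crash directory.
# Directories get 700 so their owner can still enter them,
# regular files (the dumps) get 600.
# Usage: lock_down /var/crash
lock_down() {
    find "$1" -type d -exec chmod 700 {} \;
    find "$1" -type f -exec chmod 600 {} \;
}
```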

If the file minfree exists in the crash directory, the number in this file specifies how many kilobytes of space must remain free on this filesystem once savecore has completed.

Dumping to a special device (i.e. not the swap device) is possible. On Solaris 1, add the following line to the kernel configuration file (assuming you want to use device sd1b) and rebuild the kernel (see man config):
```
config vmunix swap on sd1b
```
On Solaris 2 it is a bit trickier: adb must be used.

If several system crashes are expected, compress previous dumps. They often compress to only 5% of the original size. The same goes for dumps which are archived.

Crash dumps MUST be analysed on the same OS version and architecture on which they were created (with savecore).
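
The tip about compressing previous dumps can be sketched as follows. The helper only lists the files belonging to every dump except the most recent one, so the newest dump stays ready for analysis. The function name old_dump_files is hypothetical, and the sketch assumes the usual vmcore.N / unix.N / vmunix.N file names.

```shell
#!/bin/sh
# Sketch (hypothetical helper): print the files of all dumps except the
# newest one, skipping files that already carry a .Z suffix.
# Usage: old_dump_files /var/crash/`uname -n`
old_dump_files() {
    dir=$1
    # Highest dump number that has a vmcore file.
    newest=`ls "$dir" | sed -n 's/^vmcore\.\([0-9][0-9]*\)$/\1/p' | sort -n | tail -1`
    [ -n "$newest" ] || return 0
    for f in "$dir"/vmcore.[0-9]* "$dir"/unix.[0-9]* "$dir"/vmunix.[0-9]*; do
        [ -f "$f" ] || continue
        case "$f" in
            *."$newest") ;;   # most recent dump: leave uncompressed
            *.Z) ;;           # already compressed
            *) echo "$f" ;;
        esac
    done
}

# Possible use (not run here):
#   for f in `old_dump_files /var/crash/\`uname -n\``; do compress "$f"; done
```

Listing first and compressing in a separate step makes it easy to review what will be touched before any file is changed.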

Initial Crash Dump analysis

The following commands can be used to analyse what was going on in the system before the panic occurred.

| Description | Solaris 1 (SunOS 4) | Solaris 2.x |
| --- | --- | --- |
| What OS is this? | strings vmcore.0 \| grep SunOS | strings vmcore.0 \| grep SunOS |
| What host is this? | strings vmcore.0 \| grep machine | strings vmcore.0 \| grep machine |
| What processes were running? | ps -laxk vmunix.0 vmcore.0 | use crash (see below) |
| Show system tables | pstat -T vmunix.0 vmcore.0 | |
| Show network stats | netstat vmunix.0 vmcore.0 | netstat -d unix.0 vmcore.0 |
| Show NFS stats | nfsstat -n vmunix.0 vmcore.0 | nfsstat -n unix.0 vmcore.0 |
| Show arp table | arp -c vmunix.0 vmcore.0 | arp -a unix.0 vmcore.0 |
| Show IPC stuff | ipcs -a -N vmunix.0 -C vmcore.0 | ipcs -a -N unix.0 -C vmcore.0 |
| Using crash: | /etc/crash -d vmcore.0 -n vmunix.0 | /usr/sbin/crash -d vmcore.0 -n unix.0 |
| crash help | > help | > help |
| Help on "p" command | > help p | > help p |
| Show processes | > p -e | > p -e |
| Lots of process details | | > p -l |
| crash details | | > status |
| Quit crash | > q | > q |
| Using the adb debugger: | adb -k vmunix.0 vmcore.0 | adb -k unix.0 vmcore.0 |
| What was the panic message? | *panicstr/s | *panicstr/s |
| Hostname | hostname/s | $<utsname |
| OS version | version/s | $<utsname |
| Domain | domainname/s | srpc_domain/s |
| Machine | sysname/s | $<utsname |
| Manufacturer | | hw_provider/s |
| Crash time/date | time/Y | TIME/y |
| Boot time/date | *boottime=Y | *time-(lbolt%0t100)=Y |
| Display system messages | msgbuf+10/s | msgbuf+14/s |
| Recent message buffer (ring) | $<msgbuf | $<msgbuf |
| C stack traceback (not always right!) | $c | $c |
| Stack traceback | <sp$<stacktrace | ?? |
| What is root device? | | rootfs$<bootobj |
| What is swap device? | | swapfile$<bootobj dumpfile$<bootobj |
| Show registers | $cregs | |
| Show IPC stuff | ipcaccess/10i | |
| Quit adb | CTRL-D or $q | CTRL-D or $q |
| Access a live kernel: | adb -k /vmunix /dev/mem | adb -k /dev/ksyms /dev/mem |

adb macros are located in /usr/lib/adb (Solaris 1) or /usr/kvm/lib/adb (Solaris 2).
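
Because adb reads its commands from standard input, the first few checks can be scripted instead of typed. The sketch below only prints a command list built from the adb entries above; the function name adb_first_look is hypothetical, and the selection of commands is just one possible starting point.

```shell
#!/bin/sh
# Sketch (hypothetical helper): emit adb commands for a first look at a dump.
# Usage: adb_first_look 1   -> Solaris 1 (SunOS 4) commands
#        adb_first_look 2   -> Solaris 2.x commands
adb_first_look() {
    echo '*panicstr/s'             # panic message
    case "$1" in
        1) echo 'hostname/s'       # hostname
           echo 'version/s' ;;     # OS version
        *) echo '$<utsname' ;;     # hostname, OS version, machine
    esac
    echo '$<msgbuf'                # recent message buffer
    echo '$c'                      # C stack traceback
    echo '$q'                      # quit adb
}

# Possible use on Solaris 2 (not run here):
#   adb_first_look 2 | adb -k unix.0 vmcore.0 > panic-report.txt
```

Keeping the command list separate from the adb invocation makes it simple to save the same report for every dump and compare them later.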
