Basic Solaris network troubleshooting

The aim of this post is to provide a brief insight into the basics of troubleshooting Solaris network services and issues using the standard everyday tools ping, snoop, netstat and nfsstat.

Using ping

ping tells you if a host (schlumpf in these examples) is alive.

# ping schlumpf
schlumpf is alive
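
When there are several machines to check, the same test can be wrapped in a small loop. This is only a sketch: it assumes ping exits with status 0 when the host answers, and it passes a timeout of 5 seconds as the second argument, following the Solaris synopsis "ping host [timeout]". The function name check_hosts is made up for this example.

```shell
#!/bin/sh
# Sketch: report which of the given hosts answer ping.
# Assumption: ping exits 0 when the host is alive, non-zero otherwise.
# The trailing 5 is the timeout in seconds (Solaris: ping host [timeout]).
check_hosts() {
    for host in "$@"; do
        if ping "$host" 5 >/dev/null 2>&1; then
            echo "$host is up"
        else
            echo "$host is DOWN"
        fi
    done
}
```

Call it as check_hosts schlumpf otherhost, where otherhost stands for any other machine you want to check.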

Sometimes you need more information. With -s, ping sends one datagram (56 bytes) every second and reports the round trip time (in milliseconds) and packet loss statistics.

# ping -s schlumpf

To change the interval (in seconds) between datagrams, use -I. The command below sends a datagram every 5 seconds.

# ping -s -I 5 schlumpf

To send a datagram of a different size, give the size after the host name as below (the datagram is 1024 bytes).

# ping -s schlumpf 1024
PING schlumpf: 1024 data bytes
1032 bytes from schlumpf (213.64.10.2): icmp_seq=0. time=1.39 ms
...

Using snoop

snoop captures packets from the network and displays their contents.

# snoop
Using device /dev/hme0 (promiscuous mode)

When you run snoop, the NIC works in promiscuous mode, meaning it passes all traffic to the CPU (not only traffic addressed to it).

You can capture only packets addressed to your machine (plus broadcast/multicast) using option -P.

# snoop -P
Using device /dev/hme0 (non promiscuous)
....

You can capture packets to a file for later analysis.

# snoop -q -o /tmp/snoop.txt

-q suppresses the packet count display, which improves capture performance (but it also means you may be capturing nothing without being aware of it).

The capture file is in snoop's binary format rather than plain text, so read it back with snoop:

# snoop -i /tmp/snoop.txt

Tips:

I like option -V since it gives nice output (try it: snoop -V).

Have more than one NIC? Use -d to pick the device, for example snoop -d hme0 (example for second NIC in Ultra-10).

Use -r to skip resolving IP addresses to names; this also prevents snoop from generating network traffic of its own.

If you really need lots of detail, try: snoop -v

To capture packets with host hostname as either source or destination: snoop hostname

To capture packets between 2 hosts: snoop host1 host2
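
The options above combine naturally. As a rough sketch, the helper below captures traffic between two hosts into a file, quietly and without name resolution. It uses only the flags already described in this post (-q, -r, -o and the two-host filter); the function name capture_pair and the argument order are my own choices for the example.

```shell
#!/bin/sh
# Sketch: capture traffic between two hosts into a file.
# $1 and $2 are the hosts, $3 is the capture file (hypothetical layout).
# -q: no packet count, -r: no name resolution, -o: write to file.
capture_pair() {
    snoop -q -r -o "$3" "$1" "$2"
}
```

Stop the capture with Ctrl-C, then read it back with snoop -i and the same file name.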

Using netstat

netstat shows network status. Some of the most common options are:

1. To show state of interface(s).

# netstat -i
Name  Mtu  Net/Dest Address   Ipkts   Ierrs Opkts Oerrs Collis Queue
lo0   8232 loopback localhost 400     0     400   0     0      0
hme0  1500 schlumpf schlumpf  1730873 0     60662 0     0      0

Divide the number of collisions (Collis) by the number of output packets (Opkts) and multiply by 100; if the percentage is greater than 5-10% you may have a problem.

The machine might be dropping packets if the input error rate, (Ierrs x 100)/Ipkts, is over 0.25%.

Have more than one NIC? Use "-I" to specify the interface. The example below is for the second NIC in a SunFire V240 and reports every 10 seconds.

# netstat -I hme0 10
    input   hme0      output           input  (Total)    output
packets   errs  packets   errs  colls  packets   errs  packets   errs  colls
303712691 0     160988476 0     0      936921091 0     540020651 0     0
21        0     27        0     0      87        0     94        0     0
3329      0     1690      0     0      3403      0     1761      0     0

2. To display routing table

# netstat -r
Routing Table: IPv4
Destination            Gateway              Flags Ref   Use        Interface
---------------------- -------------------- ----- ----- ---------- ---------
default                dns1.dyndns.org      UG    1     355
213.64.10.0            schlumpf             U     1     153        hme0
BASE-ADDRESS.MCAST.NET schlumpf             U     1     0          hme0
localhost              localhost            UH    1     0          lo0

3. To show statistics for UDP, TCP, ICMP and IGMP

# netstat -s

4. Use -n to see IP addresses and who has established connections with your host.

# netstat -n
TCP: IPv4
Local Address        Remote Address       Swind Send-Q Rwind Recv-Q State
-------------------- -------------------- ----- ------ ----- ------ -----------
213.64.10.2.37626    213.64.10.2.33326    49152 0      49152 0      ESTABLISHED
213.64.10.2.33326    213.64.10.2.37626    49152 0      49152 0      ESTABLISHED

5. To see state of all sockets use netstat -a
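
Two of the checks above are easy to script. The sketch below is not a definitive tool: both helpers read netstat output on standard input and assume the column layout shown in the samples in this post (Ipkts, Ierrs, Opkts and Collis in fields 5, 6, 7 and 9 of netstat -i; the remote address in field 2 of netstat -n). The function names are invented for the example. The first computes the collision percentage (Collis x 100 / Opkts) and the input error percentage (Ierrs x 100 / Ipkts) per interface; the second counts ESTABLISHED connections per remote address.

```shell
#!/bin/sh
# Sketch only. Column positions are assumed from the sample output above.
# On Solaris, use nawk or /usr/xpg4/bin/awk if plain awk rejects sub().

# Usage: netstat -i | iface_rates
# Assumed fields: $5=Ipkts $6=Ierrs $7=Opkts $9=Collis
iface_rates() {
    awk '
        $1 == "Name" || NF < 9 { next }    # skip headers and short lines
        {
            coll = 0; ierr = 0
            if ($7 > 0) coll = ($9 * 100) / $7    # collisions per output packet
            if ($5 > 0) ierr = ($6 * 100) / $5    # input errors per input packet
            printf "%s collisions=%.2f%% input_errors=%.2f%%\n", $1, coll, ierr
        }'
}

# Usage: netstat -n | established_by_peer
# Prints "count remote-address", busiest peer first.
established_by_peer() {
    awk '
        $NF == "ESTABLISHED" {
            peer = $2
            sub(/\.[0-9]+$/, "", peer)     # drop the trailing .port
            count[peer]++
        }
        END { for (p in count) print count[p], p }' | sort -rn
}
```

If your netstat prints extra sections (for example IPv6) with different columns, the field numbers will need adjusting.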

Using nfsstat

A little background on this: NFS uses RPC, which translates a local command into a request for the remote system. While the server works on the call and the client waits for the result, the client application is suspended.

If the server doesn't respond, the client retransmits the request, which increases traffic. Too many retransmissions hurt NFS performance. Other possible problems: an overloaded server is slow, a NIC is dropping packets, or network congestion is slowing packets down.

Use the following options to see NFS statistics:

nfsstat -c    client statistics
nfsstat -s    server statistics
nfsstat -n    NFS statistics for both server and client
nfsstat -r    RPC statistics
nfsstat -m    statistics for each mounted NFS file system

There is a lot of output here; the fields you will probably look at first are:

calls : number of calls (sent for a client, received for a server)
badcalls : number of rejected calls
retrans : number of retransmissions
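
To put retrans in proportion, it helps to express it as a percentage of calls. The sketch below reads nfsstat output on standard input (for example from nfsstat -rc) and assumes the usual layout of a header line with field names followed by a line of values in the same column order. That layout is an assumption, as is the function name retrans_rate, so check it against your own output before relying on it.

```shell
#!/bin/sh
# Sketch: retransmissions as a percentage of calls.
# Assumption: a header line containing both "calls" and "retrans"
# is immediately followed by a line of values in the same columns.
# Usage: nfsstat -rc | retrans_rate
retrans_rate() {
    awk '
        want {
            # value line: c and r are the column numbers found in the header
            if ($c > 0) printf "retrans=%.2f%% of %d calls\n", ($r * 100) / $c, $c
            want = 0
            next
        }
        {
            c = 0; r = 0
            for (i = 1; i <= NF; i++) {
                if ($i == "calls")   c = i
                if ($i == "retrans") r = i
            }
            if (c && r) want = 1    # next line holds the numbers
        }'
}
```

Sections without a retrans column are skipped, so only the blocks that report retransmissions produce a line of output.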