Basic Solaris Network Troubleshooting
The aim of this post is to give a brief introduction to troubleshooting Solaris network services and connectivity issues with the standard everyday tools ping, snoop, netstat and nfsstat.
Using ping
ping tells you if a host (schlumpf in these examples) is alive.
# ping schlumpf
schlumpf is alive
Sometimes you need more information. With -s, ping sends one datagram (56 bytes) every second and reports the round-trip time (in milliseconds) and packet loss statistics.
# ping -s schlumpf
To change the interval (in seconds) between datagrams, use -I. The command below sends a datagram every 5 seconds.
# ping -s -I 5 schlumpf
To send a datagram of a different size, give the size after the hostname (1024 bytes in the example below).
# ping -s schlumpf 1024
PING schlumpf: 1024 data bytes
1032 bytes from schlumpf (213.64.10.2): icmp_seq=0. time=1.39 ms
...
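The plain "is alive" check is easy to script when you have several hosts to test. The following is only a minimal sketch: it assumes the Solaris form ping host [timeout] and that ping exits with status 0 when the host answers. The names check_hosts, PING_CMD and PING_TIMEOUT are made up for this example and are not part of ping.

```shell
#!/bin/sh
# check_hosts HOST...: report which of the given hosts answer ping.
# PING_CMD and PING_TIMEOUT are hypothetical knobs for this sketch:
#   PING_CMD     command used to probe a host (default: ping)
#   PING_TIMEOUT seconds to wait for an answer (default: 5)
check_hosts() {
    for host in "$@"; do
        # Assumed Solaris syntax: ping host timeout
        # Exit status 0 is taken to mean the host answered.
        if ${PING_CMD:-ping} "$host" "${PING_TIMEOUT:-5}" >/dev/null 2>&1; then
            echo "$host is alive"
        else
            echo "$host did not answer"
        fi
    done
}
```

For example, check_hosts schlumpf prints "schlumpf is alive" if the host answers within the timeout.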
Using snoop
snoop captures packets from the network and displays their contents.
# snoop
Using device /dev/hme0 (promiscuous mode)
When you run snoop, the NIC works in promiscuous mode, meaning it passes all traffic to the CPU (not only traffic addressed to it).
You can capture only packets addressed to your machine (plus broadcast/multicast) with the -P option.
# snoop -P
Using device /dev/hme0 (non promiscuous)
....
You can capture packets to a file for later analysis.
# snoop -q -o /tmp/snoop.txt
-q suppresses the packet count display, which improves capture performance (but it also means you get no indication if nothing is being captured).
To view this file use:
# snoop -i /tmp/snoop.txt
Tips:
I like the -V option since it gives nice output (try it: snoop -V).
Have more than one NIC? Use snoop -d hme0 (example for the second NIC in an Ultra-10).
Use -r to skip resolving IP addresses to names, which also prevents snoop from generating network traffic of its own.
If you really need lots of detail, try: snoop -v
To capture packets with host hostname as either source or destination: snoop hostname
To capture packets between 2 hosts: snoop host1 host2
Using netstat
netstat shows network status. Some of the most common options are:
1. To show state of interface(s).
# netstat -i
Name  Mtu   Net/Dest  Address    Ipkts    Ierrs  Opkts  Oerrs  Collis  Queue
lo0   8232  loopback  localhost  400      0      400    0      0       0
hme0  1500  schlumpf  schlumpf   1730873  0      60662  0      0       0
Divide the collision count (Collis) by the number of output packets (Opkts) and multiply by 100; if the result is greater than 5-10%, you may have a problem. The machine might be dropping packets if the input error rate, (Ierrs x 100)/Ipkts, is over 0.25%.
Have more than one NIC? Use -I to specify the interface. The example below is for the second NIC in a SunFire V240 and reports every 10 seconds.
# netstat -I hme0 10
    input   hme0      output               input  (Total)    output
packets   errs  packets   errs  colls  packets   errs  packets   errs  colls
303712691 0     160988476 0     0      936921091 0     540020651 0     0
21        0     27        0     0      87        0     94        0     0
3329      0     1690      0     0      3403      0     1761      0     0
2. To display routing table
# netstat -r
Routing Table: IPv4
  Destination            Gateway              Flags Ref   Use        Interface
---------------------- -------------------- ----- ----- ---------- ---------
default                dns1.dyndns.org      UG        1        355
213.64.10.0            schlumpf             U         1        153 hme0
BASE-ADDRESS.MCAST.NET schlumpf             U         1          0 hme0
localhost              localhost            UH        1          0 lo0
3. To show statistics for UDP, TCP, ICMP and IGMP
# netstat -s
4. Use -n to see IP addresses and who has an established connection with your host.
# netstat -n
TCP: IPv4
   Local Address        Remote Address    Swind Send-Q Rwind Recv-Q    State
-------------------- -------------------- ----- ------ ----- ------ -----------
213.64.10.2.37626    213.64.10.2.33326    49152      0 49152      0 ESTABLISHED
213.64.10.2.33326    213.64.10.2.37626    49152      0 49152      0 ESTABLISHED
5. To see the state of all sockets, use netstat -a
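The interface checks from item 1 (collisions as a percentage of Opkts, input errors as a percentage of Ipkts) can be automated with a little awk. This is a rough sketch, not a polished tool: it assumes the netstat -i column order shown above (Name, Mtu, Net/Dest, Address, Ipkts, Ierrs, Opkts, Oerrs, Collis), and iface_rates is a name invented for this example.

```shell
#!/bin/sh
# iface_rates: read `netstat -i` output on stdin and print, per interface,
# collisions as a percentage of output packets and input errors as a
# percentage of input packets.
# Assumed columns: 1=Name 5=Ipkts 6=Ierrs 7=Opkts 9=Collis
iface_rates() {
    awk '
        # Skip header lines and anything too short to be an interface row.
        $1 == "Name" || NF < 9 { next }
        {
            ipkts = $5; ierrs = $6; opkts = $7; collis = $9
            coll_pct = 0
            ierr_pct = 0
            # Guard against division by zero on idle interfaces.
            if (opkts > 0) { coll_pct = collis * 100 / opkts }
            if (ipkts > 0) { ierr_pct = ierrs * 100 / ipkts }
            printf "%s collisions=%.2f%% input_errors=%.2f%%\n", $1, coll_pct, ierr_pct
        }
    '
}
```

Run it as netstat -i | iface_rates and compare the numbers with the thresholds above (5-10% for collisions, 0.25% for input errors).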
Using nfsstat
A little background: NFS uses RPC, which translates a local command into a request for the remote system. While the server is working on the call, the client application is suspended until the result returns.
If the server doesn't respond, the client retransmits the request, which increases traffic. Too many retransmissions hurt NFS performance. Other common problems are an overloaded server responding slowly, a NIC dropping packets, and network congestion delaying packets.
Use the following options to see NFS statistics:
nfsstat -c    client statistics
nfsstat -s    server statistics
nfsstat -n    NFS statistics for both server and client
nfsstat -r    RPC statistics
nfsstat -m    statistics for each mounted NFS file system
There is a lot of output here; the fields you will probably look at first are:
calls : number of calls (sent for a client, received for a server)
badcalls : number of rejected calls
retrans : number of retransmissions
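Since retransmissions only mean something relative to the number of calls, it helps to look at retrans as a percentage of calls. Below is a small sketch of that arithmetic. It assumes you read the calls and retrans values off the nfsstat output yourself and pass them in by hand; retrans_pct is a name invented for this example.

```shell
#!/bin/sh
# retrans_pct CALLS RETRANS: print retransmissions as a percentage of calls.
# Both values are assumed to be copied by hand from nfsstat output.
retrans_pct() {
    echo "$1 $2" | awk '{
        # Avoid dividing by zero when no calls have been made yet.
        if ($1 > 0) {
            printf "%.2f\n", $2 * 100 / $1
        } else {
            print "0.00"
        }
    }'
}
```

For instance, with made-up values of 1000 calls and 50 retransmissions, retrans_pct 1000 50 prints 5.00.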