Troubleshooting various NFS errors

Identifying and solving problems is arguably the most difficult task in any area of computing, and NFS is no exception. The good news is that the usual troubleshooting steps apply everywhere: identify the problem, then isolate its cause by a process of elimination. The first step is always to identify the problem, and error messages help with that.

Determine the NFS version

To determine which versions and transports of NFS are currently available, run rpcinfo on the NFS server.

# rpcinfo -p | grep 100003
100003 2 udp 2049 nfs
100003 3 udp 2049 nfs
100003 2 tcp 2049 nfs
100003 3 tcp 2049 nfs
100003 4 tcp 2049 nfs

In the output above, the second column shows the NFS version and the third column shows the transport protocol.
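As a minimal sketch of reading those columns, the function below groups the NFS versions by transport. The name nfs_versions is hypothetical, and the sketch assumes the five-column layout shown in the sample output (program, version, transport, port, service).

```shell
# Summarize NFS versions per transport from `rpcinfo -p` output on stdin.
# nfs_versions is a hypothetical helper name, not a Solaris command.
# Assumes column 1 is the RPC program number (100003 for NFS),
# column 2 the version, and column 3 the transport.
nfs_versions() {
    awk '$1 == 100003 { seen[$3] = seen[$3] " " $2 }
         END { for (t in seen) print t ":" seen[t] }' | sort
}
```

On the server, it could be used as: rpcinfo -p | nfs_versions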

Sun has implemented the following versions of NFS on its operating systems, for both client and server:

OS Version                   NFSv2            NFSv3            NFSv4
SunOS 4.x                    UDP              -                -
Solaris 2.0 and below        UDP              -                -
Solaris 2.5x, 2.6, 7, 8, 9   UDP and/or TCP   UDP and/or TCP   -
Solaris 10                   UDP and/or TCP   UDP and/or TCP   TCP*

* NFSv4 does not support the UDP transport, because UDP lacks the congestion control that NFSv4 requires.

Common NFS errors

The following entries describe common NFS error messages and possible solutions.

Error message: Could not start <daemon>: <error>
Description: This message is displayed if <daemon> terminates abnormally or if a system call generates an error. The <error> string states the problem.
Possible solution: This error message is rare and has no straightforward solution. Contact Sun support for help.

Error message: Cannot establish NFS service over /dev/tcp: transport setup problem.
Description: The services information in the namespace has probably not been updated.
Possible solution: Update the services information in the namespace for NIS/NIS+.

Error message: <daemon> running already with pid <pid>.
Description: The daemon specified by <daemon> that you are trying to start is already running.
Possible solution: If you have to restart it, issue the restart command, or stop it first and then start it.

Error message: <filename>: File too large.
Description: An NFS version 2 client is attempting to access a file that is larger than 2 GB.
Possible solution: Do not use NFS version 2. Mount the file system with NFS version 3 or version 4.

Error message: NFS server recovering.
Description: During the server reboot, some operations were not permitted, so the client is waiting for the server to permit this operation to proceed.
Possible solution: No action is required. Wait for the server to reach the point where it can permit the operation.

Error message: NFS server <hostname> not responding still trying.
Description: The NFS server specified by <hostname> is down, or there is some other problem with the server or the network.
Possible solution: Troubleshoot the connectivity.

Error message: NFS file temporarily unavailable on the server, retrying . . .
Description: The server is recalling a delegation for another client that conflicts with the request from your client.
Possible solution: The server recall must occur before the server can process your client's request.

Error message: mount: . . . No such file or directory
Description: Either the remote directory or the local directory does not exist.
Possible solution: Check the spelling of the directory names, and run the ls command for both directories.
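The error messages above can also be sketched as a small lookup, which may be convenient when scanning console or log output. The function name nfs_error_hint is hypothetical, and the patterns are only the fixed parts of the messages listed above. Treat it as an illustration, not a complete catalogue.

```shell
# Print the suggested action for a known NFS error message.
# nfs_error_hint is a hypothetical helper name, not a Solaris command.
# Returns 1 when the message is not one of the messages listed above.
nfs_error_hint() {
    case "$1" in
        *"Could not start"*)           echo "Rare error; contact support" ;;
        *"transport setup problem"*)   echo "Update the services information in NIS/NIS+" ;;
        *"running already with pid"*)  echo "Restart the daemon, or stop it and then start it" ;;
        *"File too large"*)            echo "Remount with NFS version 3 or 4" ;;
        *"NFS server recovering"*)     echo "No action required; wait for the server" ;;
        *"not responding"*)            echo "Troubleshoot connectivity to the server" ;;
        *"temporarily unavailable"*)   echo "Wait for the server to finish the delegation recall" ;;
        *"No such file or directory"*) echo "Check the directory names on client and server" ;;
        *)                             echo "No hint available"; return 1 ;;
    esac
}
```

For example: nfs_error_hint "NFS server mars not responding still trying"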

When trying to narrow down an NFS problem, remember the main suspects of a possible failure: the server, the client, and the network. Try to isolate each individual component to find the one that is not working.

First of all, note that the mountd and nfsd daemons must be running on the server for remote mounts to succeed.

To start the process of isolating the problem, perform the following initial steps:

  • Check whether the client machine can access the server machine.
  • Check whether the client can contact the NFS services on the server.
  • Check whether the NFS services are running on the server.

In the process of checking these items you might discover problems with other parts of the network; in that case, continue isolating and narrowing down the problem.

You can use the ping command to check the reachability of a machine. Check the reachability of a client from a server and vice versa. Then, you can check whether the NFS services are running. For example, to check remotely whether the NFS service has started on the server, issue the following command:

# rpcinfo -s <serverName> | egrep 'nfs|mountd'

To check whether the nfsd daemon on the server is responding, issue the following command on the client machine:

# rpcinfo -t <serverName> nfs

This checks the NFS connection of the client with the server over TCP. If you want to check whether the daemon mountd is running on the server, issue the following command on the client machine:

# rpcinfo -t <serverName> mountd

If the server is running, it prints a list of programs and version numbers.

You can also run the rpcinfo command directly on the server machine, for example:

# rpcinfo -t localhost nfs

If the server is running, it displays a list of programs and version numbers associated with the TCP protocol.
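The checks described above can be chained so that the first failing one is reported. The following is a minimal sketch, not a standard tool: the function name check_nfs_server and the PING and RPCINFO override variables are hypothetical. It assumes that ping returns on its own, as it does on Solaris; on systems where ping runs until interrupted, set PING to a form that stops after one reply.

```shell
# Run the initial checks in order against one server and stop at the
# first failure. check_nfs_server, PING, and RPCINFO are hypothetical
# names; PING and RPCINFO default to the real commands.
check_nfs_server() {
    server=$1
    # Step 1: is the server machine reachable at all?
    ${PING:-ping} "$server" >/dev/null 2>&1 || {
        echo "$server: host unreachable"
        return 1
    }
    # Steps 2 and 3: do nfsd and mountd answer over TCP?
    for svc in nfs mountd; do
        ${RPCINFO:-rpcinfo} -t "$server" "$svc" >/dev/null 2>&1 || {
            echo "$server: $svc not responding over TCP"
            return 2
        }
    done
    echo "$server: reachable, nfs and mountd responding"
}
```

From a client, it could be called as: check_nfs_server mars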

Example: troubleshooting an NFS client

The main troubleshooting tools for NFS are the ping and rpcinfo commands. Of course, you can always use the svcs command to check whether the NFS service is running.

Suppose you are logged on to an NFS client machine and are troubleshooting its connectivity with the server machine named mars. Perform the following steps:

  1. Check the reachability of the server by issuing the following command on the client:
    # ping mars
  2. If the server is reachable, investigate the server, for example with the rpcinfo command. If the server is unreachable, continue investigating the client. Next, on the client, make sure that the local name service is running by issuing the following command:
    # /usr/lib/nis/nisping -u
  3. If you find that the service is running, make sure that the client has received the correct host information by issuing the following command:
    # /usr/bin/getent hosts mars
    10.1.1.12 mars.lab
  4. Suppose the host information is correct, but you already found that the server was not reachable. In that case, try to reach the server from another client by using the ping command:
    1. If the server is not reachable from the second client, the server needs to be investigated, for example by using the rpcinfo command.
    2. If the server is reachable from the second client, continue investigating this client.
  5. Issue the ping command on this client to check its connectivity with other machines on the network.
  6. If the ping command fails, check the configuration files on the client, such as /etc/netmasks and /etc/nsswitch.conf.
  7. Issue the rpcinfo command on the client (for example, rpcinfo -t mars nfs, as shown earlier) to see whether it displays something like the following:
    program 100003 version 4 ready and waiting
    If it does not display this, then NFS version 4 is not enabled. In this case, enable the NFS service.
  8. If you have not yet found the problem, you should check the hardware.
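The check in step 7 can be scripted by filtering the rpcinfo output. This is a minimal sketch that assumes the "program 100003 version N ready and waiting" line format shown in that step. The function names nfs_ready_versions and nfs_v4_ready are hypothetical.

```shell
# List the NFS versions that rpcinfo reports as ready, one per line.
# Reads rpcinfo output on stdin; assumes lines of the form
# "program 100003 version N ready and waiting".
nfs_ready_versions() {
    awk '$1 == "program" && $2 == 100003 && /ready and waiting/ { print $4 }'
}

# Succeed only if version 4 is among the ready versions.
nfs_v4_ready() {
    nfs_ready_versions | grep '^4$' >/dev/null
}
```

On the client, it could be used as: rpcinfo -t mars nfs | nfs_v4_ready && echo "NFS version 4 is enabled"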