Troubleshooting 'too many open files'

In Solaris, every process is subject to a kernel parameter that limits the number of files it can have open. rlim_fd_cur specifies the 'soft' limit on the number of file descriptors a single process can have open. A process can adjust its own file descriptor limit to any value up to the 'hard' limit, which is defined by rlim_fd_max.
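The soft/hard relationship can be sketched from inside a process. The example below is a minimal illustration, assuming Python's standard resource module is available on the system; the helper name raise_soft_nofile is hypothetical and not part of any Solaris tool.

```python
import resource


def raise_soft_nofile(target):
    """Raise the soft descriptor limit toward `target`, never above the hard limit.

    Returns the (soft, hard) pair in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        # Already unlimited, nothing to raise.
        return soft, hard
    # Clamp the request to the hard limit unless the hard limit is unlimited.
    new_soft = target if hard == resource.RLIM_INFINITY else min(target, hard)
    if new_soft > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

Calling raise_soft_nofile(1024) in a process whose limits are 256 (soft) and 65536 (hard) would be expected to leave it with a soft limit of 1024; a request above the hard limit is clamped rather than rejected.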

The following table shows the system defaults for various Solaris releases:

Solaris   Soft limit (rlim_fd_cur)   Hard limit (rlim_fd_max)
2.6 & 7   64                         1024
8         256                        1024
9 & 10    256                        65536

To increase the 'soft' (rlim_fd_cur) and 'hard' (rlim_fd_max) limits system-wide, edit the /etc/system file as root. The new values take effect at the next reboot. For example:

To set the soft limit of the number of file descriptors per process to 1024:

set rlim_fd_cur = 1024

To set the hard limit of the number of file descriptors per process to 8192:

set rlim_fd_max = 8192

 

If you face the 'too many open files' error, here are a few things you can try to identify the source of the problem:

  • Check current limits.
  • Check the limits of a running process.
  • Track possible file descriptor leaks.
  • Track open files in real time.

Check current limits

The ulimit -a command lists the limits in effect for the current shell session. The field of interest is nofiles(descriptors). By default, any process started by the current shell inherits these limits.

For example, in Solaris:

# ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 8192
coredump(blocks) unlimited
nofiles(descriptors) 256
vmemory(kbytes) unlimited

If the limit is too small, you can increase it with the ulimit command and its -n option (for example, ulimit -n 1024), then relaunch your application.
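The same "set the limit, then launch" sequence can be scripted. The sketch below assumes Python's standard resource and subprocess modules; the child command is only a stand-in for a real application, and the helper name child_soft_limit is hypothetical. It sets the soft limit between fork and exec, so the launched program inherits it just as it would inherit a shell's ulimit -n setting.

```python
import resource
import subprocess
import sys

# Stand-in for a real application: a child that prints its own soft limit.
CHILD_CODE = "import resource; print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])"


def child_soft_limit(soft_limit):
    """Launch a child with `soft_limit` as its soft descriptor limit.

    Returns the soft limit the child reports, showing that it was inherited.
    """
    def set_limit():
        # Runs in the child after fork and before exec; the hard limit is kept.
        _, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, hard))

    done = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        preexec_fn=set_limit,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(done.stdout.strip())
```

The parent's own limits are left untouched, which is the main reason to set the limit in the child rather than in the launcher itself.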

Check limits of a running process

Some system calls allow a process to change its limits while it is running, so the values in effect may differ from the defaults inherited from the shell. To check the current settings of a running process, inspect the /proc environment for that process.

Under Solaris, we can use the plimit command. For example:

# plimit 28903
28903:   ksh -o vi
   resource              current         maximum
  time(seconds)         unlimited       unlimited
  file(blocks)          unlimited       unlimited
  data(kbytes)          unlimited       unlimited
  stack(kbytes)         8192            unlimited
  coredump(blocks)      unlimited       unlimited
  nofiles(descriptors)  256             65536
  vmemory(kbytes)       unlimited       unlimited

The file descriptor limit is shown as 'nofiles(descriptors)'.
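When many processes need checking, the plimit output can be read by a script. The following is a minimal sketch that assumes the output has the layout shown above (resource name, current value, maximum value); the function name parse_plimit is hypothetical, and other Solaris releases may format the output differently.

```python
import re

# Sample text copied from the plimit example above.
SAMPLE_PLIMIT = """\
28903:   ksh -o vi
   resource              current         maximum
  time(seconds)         unlimited       unlimited
  file(blocks)          unlimited       unlimited
  data(kbytes)          unlimited       unlimited
  stack(kbytes)         8192            unlimited
  coredump(blocks)      unlimited       unlimited
  nofiles(descriptors)  256             65536
  vmemory(kbytes)       unlimited       unlimited
"""

# A resource row looks like: name(unit)  current  maximum
ROW = re.compile(r"^\s*(\w+\(\w+\))\s+(\S+)\s+(\S+)\s*$")


def to_number(value):
    """Convert a limit column to int, using None for 'unlimited'."""
    return None if value == "unlimited" else int(value)


def parse_plimit(text):
    """Return {resource: (current, maximum)} for each resource row in `text`."""
    limits = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            name, current, maximum = match.groups()
            limits[name] = (to_number(current), to_number(maximum))
    return limits
```

With the sample above, parse_plimit(SAMPLE_PLIMIT)["nofiles(descriptors)"] gives the soft and hard descriptor limits as a pair of integers.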

Tracking possible file descriptor leaks

In case of a leak, regularly checking the number of file descriptors a process has open will show that number growing continuously. Keep in mind, however, that a growing number of file descriptors does not always indicate a leak. The process may simply need to open a lot of files.

There are multiple ways to check this. The easiest is to inspect the '/proc' virtual file system to see how many files the process has open.
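The periodic check can be automated. The sketch below counts the entries in /proc/<pid>/fd and applies a deliberately simple heuristic to a series of samples. Both function names and the heuristic itself (the count never drops and ends higher than it started) are assumptions made for illustration, not an established leak test.

```python
import os


def count_open_fds(pid=None):
    """Count the entries in /proc/<pid>/fd (defaults to the current process)."""
    if pid is None:
        pid = os.getpid()
    return len(os.listdir(f"/proc/{pid}/fd"))


def looks_like_leak(samples, min_samples=5):
    """Heuristic: the count never decreased and grew overall across the samples.

    Too few samples are treated as inconclusive (returns False).
    """
    if len(samples) < min_samples:
        return False
    never_drops = all(later >= earlier for earlier, later in zip(samples, samples[1:]))
    return never_drops and samples[-1] > samples[0]
```

A monitoring loop would call count_open_fds(pid) at a fixed interval, append each result to a list, and pass that list to looks_like_leak. A positive result is only a hint, since a process that legitimately opens more and more files produces the same pattern.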

Under Solaris we can use ls /proc/<pid>/fd and pfiles <pid>. For example:

# ls -al /proc/29803/fd/
total 19
dr-x------   2 root     root        8208 May 31 10:40 .
dr-x--x--x   5 root     root         864 May 31 10:40 ..
c---------   1 mc84838  staff     24,  2 May 31 10:41 0
c---------   1 mc84838  staff     24,  2 May 31 10:41 1
c---------   1 mc84838  staff     24,  2 May 31 10:41 2

While ls /proc/<pid>/fd is fast, it may not tell you which files are actually open. To list the open files of a running process together with their names, use pfiles <pid>. For example:

# pfiles 29803
29803:  ksh -o vi
  Current rlimit: 256 file descriptors
   0: S_IFCHR mode:0620 dev:270,0 ino:12582920 uid:84838 gid:10 rdev:24,2
      O_RDWR|O_NOCTTY|O_LARGEFILE
      /devices/pseudo/pts@0:2
   1: S_IFCHR mode:0620 dev:270,0 ino:12582920 uid:84838 gid:10 rdev:24,2
      O_RDWR|O_NOCTTY|O_LARGEFILE
      /devices/pseudo/pts@0:2
   2: S_IFCHR mode:0620 dev:270,0 ino:12582920 uid:84838 gid:10 rdev:24,2
      O_RDWR|O_NOCTTY|O_LARGEFILE
      /devices/pseudo/pts@0:2
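To compare pfiles snapshots taken at different times, it can help to reduce the output to a descriptor-to-path mapping. The sketch below assumes the three-line layout shown above (descriptor line, flags line, path line). The function name parse_pfiles is hypothetical, and real pfiles output contains other entry types, such as sockets, that this sketch does not interpret.

```python
import re

# Sample text copied from the pfiles example above.
SAMPLE_PFILES = """\
29803:  ksh -o vi
  Current rlimit: 256 file descriptors
   0: S_IFCHR mode:0620 dev:270,0 ino:12582920 uid:84838 gid:10 rdev:24,2
      O_RDWR|O_NOCTTY|O_LARGEFILE
      /devices/pseudo/pts@0:2
   1: S_IFCHR mode:0620 dev:270,0 ino:12582920 uid:84838 gid:10 rdev:24,2
      O_RDWR|O_NOCTTY|O_LARGEFILE
      /devices/pseudo/pts@0:2
   2: S_IFCHR mode:0620 dev:270,0 ino:12582920 uid:84838 gid:10 rdev:24,2
      O_RDWR|O_NOCTTY|O_LARGEFILE
      /devices/pseudo/pts@0:2
"""

# A descriptor entry starts with "<fd>: S_IF<type> ...".
FD_LINE = re.compile(r"^\s*(\d+):\s+(S_IF\w+)")


def parse_pfiles(text):
    """Return {fd: {"type": ..., "path": ...}} for each descriptor entry."""
    files = {}
    current = None
    for line in text.splitlines():
        match = FD_LINE.match(line)
        if match:
            current = int(match.group(1))
            files[current] = {"type": match.group(2), "path": None}
        elif current is not None and line.strip().startswith("/"):
            # A line starting with "/" after a descriptor line is its path.
            files[current]["path"] = line.strip()
    return files
```

Paths that appear in a later snapshot but not in an earlier one are the natural place to start looking when the descriptor count keeps growing.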

Tracking open files in real time

Tracking file descriptor usage in real time means monitoring both the open() and close() system calls. To be more accurate, you can use the same method to track other system calls that create a new file descriptor, such as dup() and socket().
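Whatever tool produces the trace, the bookkeeping is the same: add a descriptor when a creating call returns it and remove it on close(). The sketch below replays a list of (call, fd) events. This event format is hypothetical and chosen only for illustration; it is not the output format of dbx, dtrace, or any other tool.

```python
# Calls that create a new file descriptor in this simplified model.
CREATING_CALLS = {"open", "dup", "socket"}


def replay(events):
    """Replay (call, fd) events and return (live descriptors, peak count)."""
    live = set()
    peak = 0
    for call, fd in events:
        if call in CREATING_CALLS:
            live.add(fd)
        elif call == "close":
            live.discard(fd)
        peak = max(peak, len(live))
    return live, peak


# Hypothetical trace: descriptor 3 is closed and then reused by socket().
EXAMPLE_EVENTS = [
    ("open", 3),
    ("open", 4),
    ("close", 3),
    ("socket", 3),
    ("dup", 5),
    ("close", 4),
]
```

Descriptors still in the live set at the end of a trace are the ones that were never closed, and the peak shows how close the process came to its limit.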

To track file descriptor usage in real time, you can use a Solaris debugger such as dbx. You can also use system tools such as dtrace, or system traces if available.

System tools are the preferred choice because they execute within the kernel, which avoids the delays caused by debuggers.

Resources