Troubleshooting 'too many open files'
In Solaris, every process is subject to kernel tunable parameters that cap the number of files it can have open. `rlim_fd_cur` specifies the 'soft' limit on file descriptors that a single process can have open. A process may adjust its own file descriptor limit to any value up to the 'hard' limit, which is defined by `rlim_fd_max`.
The following table shows the system defaults for various Solaris releases:
Solaris | Soft limit (rlim_fd_cur) | Hard limit (rlim_fd_max) |
---|---|---|
2.6 & 7 | 64 | 1024 |
8 | 256 | 1024 |
9 & 10 | 256 | 65536 |
To increase the 'soft' (`rlim_fd_cur`) and 'hard' (`rlim_fd_max`) limits system-wide, edit the `/etc/system` file as root. Changes to `/etc/system` take effect at the next reboot. For example:
To set the soft limit of the number of file descriptors per process to 1024:
set rlim_fd_cur = 1024
To set the hard limit of the number of file descriptors per process to 8192:
set rlim_fd_max = 8192
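Before rebooting, it can help to confirm which descriptor tunables are actually active in the file. The following is a minimal sketch, assuming a ksh or POSIX-style shell; the function name `show_fd_tunables` is hypothetical, and the filter relies on `/etc/system` comment lines starting with an asterisk.

```shell
# Hypothetical helper: print the active rlim_fd_* settings from a
# system file (defaults to /etc/system).
show_fd_tunables() {
    file=${1:-/etc/system}
    # Lines beginning with '*' are comments in /etc/system; skip them.
    grep -v '^[*]' "$file" | grep 'rlim_fd_'
}
```

Running `show_fd_tunables` with no argument inspects `/etc/system`; passing a path lets you check a copy of the file before installing it.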
If you face the 'too many open files' error, here are a few things you can try to identify the source of the problem:
- Check current limits.
- Check the limits of a running process.
- Tracking possible file descriptor leaks.
- Tracking open files in real time.
Check current limits
The `ulimit -a` command lists the limits of the current shell session. The field of interest is `nofiles(descriptors)`. Any process started by the current shell inherits these limits by default.
For example, in Solaris:
    # ulimit -a
    time(seconds)        unlimited
    file(blocks)         unlimited
    data(kbytes)         unlimited
    stack(kbytes)        8192
    coredump(blocks)     unlimited
    nofiles(descriptors) 256
    vmemory(kbytes)      unlimited
If the limit is too small, you may want to increase it using the `ulimit` command with the `-n` option (for example, `ulimit -n 1024`), then relaunch your application from the same shell.
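Because the soft limit can only be raised as far as the hard limit, a launcher script may want to cap the requested value first. This is a minimal sketch, assuming a shell whose `ulimit` accepts the `-S` and `-H` flags (ksh does); the function name `raise_nofile` and the application path in the usage note are hypothetical.

```shell
# Hypothetical helper: set the soft descriptor limit of the current
# shell to the requested value, capped at the hard limit.
raise_nofile() {
    wanted=$1
    hard=$(ulimit -Hn)
    # The hard limit may be reported as the word "unlimited".
    if [ "$hard" != "unlimited" ] && [ "$wanted" -gt "$hard" ]; then
        wanted=$hard
    fi
    ulimit -Sn "$wanted"
}
```

A possible use is `raise_nofile 1024 && exec ./myapp`, where `./myapp` stands for your own application; the new limit is inherited by the process that the shell starts.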
Check limits of a running process
Some system calls, such as `setrlimit()`, allow a process to change its own limits while it is running. The values might therefore differ from the defaults inherited from the shell. To check the current settings of a running process, you can inspect it through the `/proc` interface.
Under Solaris, we can use the `plimit` command. For example:
    plimit 28903
    28903:  ksh -o vi
       resource              current         maximum
      time(seconds)         unlimited       unlimited
      file(blocks)          unlimited       unlimited
      data(kbytes)          unlimited       unlimited
      stack(kbytes)         8192            unlimited
      coredump(blocks)      unlimited       unlimited
      nofiles(descriptors)  256             65536
      vmemory(kbytes)       unlimited       unlimited
The file descriptor limit is shown in the `nofiles(descriptors)` row: the first value is the current (soft) limit and the second is the maximum (hard) limit.
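When checking many processes, it can be convenient to extract just the descriptor row. The sketch below assumes the `plimit` output layout shown above; the function name `nofiles_limits` is hypothetical.

```shell
# Hypothetical helper: read plimit output on stdin and print the
# soft and hard descriptor limits as "<soft> <hard>".
nofiles_limits() {
    awk '$1 == "nofiles(descriptors)" { print $2, $3 }'
}
```

For example, `plimit 28903 | nofiles_limits` would print `256 65536` for the process shown above.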
Tracking possible file descriptor leaks
There are multiple ways to track descriptor usage. The easiest is to inspect the `/proc` virtual file system and count how many files the process has open. If you check regularly and there is a leak, you will see the number keep growing. However, keep in mind that a growing number of file descriptors does not always indicate a leak. It might simply be that the process needs to open a lot of files.
Under Solaris we can use `ls /proc/<pid>/fd` and `pfiles <pid>`. For example:
    # ls -al /proc/29803/fd/
    total 19
    dr-x------   2 root     root        8208 May 31 10:40 .
    dr-x--x--x   5 root     root         864 May 31 10:40 ..
    c---------   1 mc84838  staff     24,  2 May 31 10:41 0
    c---------   1 mc84838  staff     24,  2 May 31 10:41 1
    c---------   1 mc84838  staff     24,  2 May 31 10:41 2
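Regular checking can be scripted by sampling the number of entries under `/proc/<pid>/fd`. This is a minimal sketch, assuming a ksh or POSIX-style shell (it uses `$(...)` and `$((...))`, which the classic Bourne shell lacks); the function names `count_fds` and `watch_fds` and the default interval and sample count are hypothetical.

```shell
# Hypothetical helper: print how many descriptors a process has open.
count_fds() {
    ls "/proc/$1/fd" | wc -l | tr -d ' '
}

# Hypothetical helper: print a timestamped descriptor count
# every <interval> seconds, <samples> times in total.
watch_fds() {
    pid=$1 interval=${2:-5} samples=${3:-12}
    i=0
    while [ "$i" -lt "$samples" ]; do
        echo "$(date '+%H:%M:%S') $(count_fds "$pid")"
        i=$((i + 1))
        sleep "$interval"
    done
}
```

For example, `watch_fds 29803 10 30` would sample process 29803 every 10 seconds for five minutes. A count that climbs steadily while the workload stays constant is a hint of a leak, not proof of one.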
Whilst `ls /proc/<pid>/fd` is fast, it may not tell you which files are actually open. To list the open files of a running process together with their names, use `pfiles <pid>`. For example:
    # pfiles 29803
    29803:  ksh -o vi
      Current rlimit: 256 file descriptors
       0: S_IFCHR mode:0620 dev:270,0 ino:12582920 uid:84838 gid:10 rdev:24,2
          O_RDWR|O_NOCTTY|O_LARGEFILE
          /devices/pseudo/pts@0:2
       1: S_IFCHR mode:0620 dev:270,0 ino:12582920 uid:84838 gid:10 rdev:24,2
          O_RDWR|O_NOCTTY|O_LARGEFILE
          /devices/pseudo/pts@0:2
       2: S_IFCHR mode:0620 dev:270,0 ino:12582920 uid:84838 gid:10 rdev:24,2
          O_RDWR|O_NOCTTY|O_LARGEFILE
          /devices/pseudo/pts@0:2
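For a process with hundreds of descriptors, a per-type summary is often easier to read than the full listing. The sketch below assumes the `pfiles` layout shown above, where each descriptor line starts with `<fd>:` followed by an `S_IF...` file type; the function name `summarize_pfiles` is hypothetical, and on older Solaris releases `nawk` may be needed in place of `awk`.

```shell
# Hypothetical helper: read pfiles output on stdin and print the
# number of descriptors per file type, most frequent first.
summarize_pfiles() {
    awk '$1 ~ /^[0-9]+:$/ && $2 ~ /^S_IF/ { count[$2]++ }
         END { for (t in count) print count[t], t }' | sort -rn
}
```

For example, `pfiles 29803 | summarize_pfiles` would print `3 S_IFCHR` for the process shown above. A type whose count keeps growing between runs (for instance `S_IFSOCK` or `S_IFREG`) narrows down where to look.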
Tracking open files in real time
Tracking file descriptor usage in real time means monitoring both the `open()` and `close()` system calls. To be more accurate, you can use the same method to also track system calls such as `dup()` and `socket()`, which create new file descriptors.
To track file descriptor usage in real time you can use a Solaris debugger like `dbx`. You can also use system tools like `dtrace`, or system traces if available.
System tools are the preferred choice, as they execute within the kernel and thus avoid the delays caused by debuggers.
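One way to read 'system traces' is the Solaris `truss` utility, which prints each system call a process makes. The sketch below keeps a running count of descriptors opened and not yet closed since tracing started. It assumes the usual `truss` line format, in which a successful call ends with `= <value>` and a failed one with `Err#...`; the function name `track_fds` is hypothetical, and `nawk` may be needed in place of `awk` on older releases.

```shell
# Hypothetical helper: read truss output on stdin and, after each
# successful open/dup or close, print how many descriptors opened
# during the trace are still held.
track_fds() {
    awk '
        # Successful open-like call: the line ends with "= <fd>".
        /(open|open64|dup)\(/ && $(NF-1) == "=" && $NF ~ /^[0-9]+$/ {
            if (!($NF in held)) { held[$NF] = 1; n++ }
            print n + 0
        }
        # Successful close(<fd>): the line ends with "= 0".
        /close\(/ && $(NF-1) == "=" && $NF == "0" {
            fd = $0
            sub(/.*close\(/, "", fd)   # drop everything up to "close("
            sub(/\).*/, "", fd)        # drop everything after the fd
            if (fd in held) { delete held[fd]; n-- }
            print n + 0
        }
    '
}
```

Since `truss` writes its trace to standard error, a possible invocation is `truss -t open,close -p <pid> 2>&1 | track_fds`. Descriptors that were already open before tracing began are not counted, so the number shows the trend rather than the absolute total.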