1. What version of BIND are you running? We recompiled 9.5.0-P2 with 4096 file descriptors and we no longer saw errors on our Solaris machines. Regardless, 9.5.1b2 has been the best weapon of the 9.5.x train so far (in QPS and server efficiency). Obviously, this is a beta version and that should be considered.

2. In our environment, 9.5.0-P2 had to be recompiled to use additional FDs. That is, increasing the "rlim" values had no effect on the upper limit. You must also consider the user environment named runs in and that environment's FD limits (/etc/system versus "ulimit" settings in a root .profile, for example).
You might want to review the various "rlim" settings for your environment against what the defaults are:
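
As a rough illustration of that check (a sketch, not anything BIND ships): the Python script below prints the soft and hard descriptor limits of whatever process runs it, so running it as the same user and from the same shell that starts named shows the limits named would inherit. The helper names (nofile_limits, describe, fds_per_client) are made up for this example, and the 65536 and 4000 figures are simply the ones from the message quoted below.

```python
import resource


def nofile_limits():
    """Return (soft, hard) RLIMIT_NOFILE for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(value):
    """Render a limit the way plimit/ulimit do, with 'unlimited' for no cap."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def fds_per_client(fd_limit, recursive_clients):
    """Whole descriptors available per recursive client at a given limit."""
    return fd_limit // recursive_clients


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("nofiles(descriptors) current=%s maximum=%s"
          % (describe(soft), describe(hard)))
    # Figures from the quoted message: 65536 descriptors, 4000 recursive clients.
    print("descriptors per client:", fds_per_client(65536, 4000))
```

This only reports the OS limits. It says nothing about a compile-time ceiling inside named itself, which is what we ran into.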

3. If you are running 9.5.x (or are considering moving towards it), be sure to understand the changes from previous versions. For example, max-cache-size now defaults to 32M (much too small for our environment). In addition to the isc.org ARM and README files with the source code, there is this site:

4. I know the UK doesn't celebrate Halloween, but Happy Halloween anyway.

HTH -- Chris

----- Original Message ----
From: Barry Dean
To: bind-users@isc.org
Sent: Friday, October 31, 2008 10:37:24 AM
Subject: file descriptors


A while back I migrated DNS from some old PC servers running NetBSD and
bind 9 to some new shiny Sun X4200's running Solaris 10 and bind 9
(Sun's installed version).

One of the first things we noticed on the internal DNS servers that allow
recursion is that the maximum number of recursive clients was being hit
regularly. I upped the value a few times, eventually settling on 4000 as
that seems to have stopped the messages.

On one of these servers I am now seeing a lot of:

socket: too many open file descriptors

.... errors in the messages log! Curious because even if we were at the
limit of 4000 clients, the current limit on file descriptors is ...

dns# plimit 354
354:    /usr/sbin/named
   resource              current         maximum
  time(seconds)         unlimited       unlimited
  file(blocks)          unlimited       unlimited
  data(kbytes)          unlimited       unlimited
  stack(kbytes)         unlimited       unlimited
  coredump(blocks)      unlimited       unlimited
  nofiles(descriptors)  65536           65536
  vmemory(kbytes)       unlimited       unlimited

65,536! Which would mean each and every recursing query using 16 file
descriptors!

Or is there a different lower limit on sockets? I have not heard of such
a thing?

Have I set the recursing limit higher than the FD_SETSIZE for select
(being 1024 for 32-bit apps on Solaris)? Can that be the problem?
It doesn't look that way, as the perror states socket!

Any ideas welcome!


Barry Dean
Networks Team