At 10:29 AM 12/15/2004, Stefan Puiu wrote:
>Hello,
>
>first of all, this is a question regarding libbind code; if this is not
>the right place to ask, please point me to the right list.
>
>we are using libbind 8.2.7 for an app that needs to do DDNS updates
>(however, the code I'm talking about is still there in 8.4.5). Our app
>calls res_nsendsigned(), which in turn calls res_nsend(), for sending
>some updates. However, on Windows, after some stress testing,
>res_nsendsigned() would begin to return error, setting the errno
>(actually, the value returned by WSAGetLastError()) to 10038, which is
>WSAENOTSOCK. From what I could make out of the source, this error code
>is set in lib/resolv/res_send.c from the libbind code, in the send_dg()
>function (this one seems to get called, favouring UDP over TCP as
>expected). In BIND 8.4.5, the code looks like this (line numbers on the
>left):
>
> 770 if (EXT(statp).nssocks[ns] == -1) {
> 771 EXT(statp).nssocks[ns] = socket(nsap->sa_family,
>SOCK_DGRAM, 0);
> 772 if (EXT(statp).nssocks[ns] > highestFD) {
> 773 res_nclose(statp);
> 774 errno = ENOTSOCK;
> 775 }
>
>highestFD is FD_SETSIZE-1, which, on Windows, seems to be equal to
>16383. However, I'm having a hard time understanding why somebody would
>want to limit the range of values returned by socket() that are
>considered "valid" - unless they'd expect socket() to return the first
>available free descriptor. On Windows, if you create 20.000 sockets,
>then close them all, then try creating a new one, the descriptor that is
>returned to you doesn't seem to be the lowest available, but the next
>after the one returned by the last socket() call - we've observed this
>behaviour with a simple test program. So, if at a certain point a socket
>with fd 16383 is created, then most likely all subsequent calls to
>send_dg() will fail, because socket() will return a value too big, even
>though there wouldn't be 16383 descriptors actually open. If FD_SETSIZE
>is supposed to limit the number of open descriptors, this piece of code
>doesn't seem to achieve it; it breaks libbind apps running on Windows
>instead.
>
>What I'm asking is: why is FD_SETSIZE needed? And of how much use is
>this in the aforementioned situation? Is there a way to work around this
>problem (like using some other function in libbind for sending packets)?


This is a much more complicated subject than you may realise. The problem
that you are seeing here in the library on Windows is one of the reasons
that the socket code was completely rewritten for Windows in the BIND 9
native code. The other reason was performance.

The problem stems from the fact that the Windows socket() function returns
a 32-bit unsigned integer which can and does take any value in that range.
FD_SETSIZE is really only meaningful for the fd_set passed to select(); it
does not relate to the value of the socket fd, it's just the maximum number
of sockets that select() can handle. (I'm simplifying somewhat here.) On
Unix systems, fd's are basically created sequentially. Not so on Windows,
which could hand you virtually any number whatsoever that isn't currently
in use, and regularly returns a large number. Even when it doesn't, I
normally see it start at 1000 and go up from there. It's not practical to
create a 4-gigabyte array just to allow indexing into it by fd, and it
would be a VERY sparse array anyway. On Unix, FD_SETSIZE is used to set up
an array where the index is the fd value. A better strategy is to use a
list to hold the info, but no one really wants to implement all of the
necessary changes, especially large ones.
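
To make the list idea concrete, here is a rough sketch in C. None of these
names exist in libbind; sock_handle_t, struct sock_list and the sock_list_*
functions are made up for illustration, and I'm assuming uintptr_t is wide
enough to hold either a Unix fd or a Windows SOCKET. The point is only that
the handle is stored as a key and never used as an array index, so its
numeric value doesn't matter.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical handle type: assumed wide enough for a Unix fd or a SOCKET. */
typedef uintptr_t sock_handle_t;

struct sock_node {
    sock_handle_t handle;      /* value returned by socket(), used as a key */
    void *info;                /* per-socket data owned by the caller */
    struct sock_node *next;
};

struct sock_list {
    struct sock_node *head;
    size_t count;              /* number of tracked sockets */
};

/* Track a socket. Returns 0 on success, -1 if memory runs out. */
int sock_list_add(struct sock_list *l, sock_handle_t h, void *info)
{
    struct sock_node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->handle = h;
    n->info = info;
    n->next = l->head;
    l->head = n;
    l->count++;
    return 0;
}

/* Look up the data stored for a handle, or NULL if it is not tracked. */
void *sock_list_find(const struct sock_list *l, sock_handle_t h)
{
    const struct sock_node *n;
    for (n = l->head; n != NULL; n = n->next)
        if (n->handle == h)
            return n->info;
    return NULL;
}

/* Stop tracking a handle. Returns 0 if it was found, -1 otherwise. */
int sock_list_remove(struct sock_list *l, sock_handle_t h)
{
    struct sock_node **pp = &l->head;
    while (*pp != NULL) {
        if ((*pp)->handle == h) {
            struct sock_node *dead = *pp;
            *pp = dead->next;
            free(dead);
            l->count--;
            return 0;
        }
        pp = &(*pp)->next;
    }
    return -1;
}
```

With something like this, any limit that select() imposes can be checked
against the count of tracked sockets rather than against the handle value.
The lookup is linear, which is fine for a handful of nameserver sockets; a
hash table would be the obvious next step if there were many.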

For Windows the snippet above already shows a problem, since the code
should compare against INVALID_SOCKET instead of -1. Defining the macro on
Unix to be -1 and using INVALID_SOCKET throughout would make this check
work on all Unix and Windows platforms, but that's a simple change. On
Windows, highestFD would be 2**32-1 and would not be related to FD_SETSIZE.
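
A minimal sketch of that macro approach, in C. INVALID_SOCKET and SOCKET
are the real Winsock names; res_socket_t, res_socket_ok() and
res_open_dgram() are hypothetical names I've invented here and are not
part of libbind. On Windows this also assumes WSAStartup() has already
been called and that the program links against the Winsock library.

```c
#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET res_socket_t;        /* unsigned handle on Windows */
#else
#include <sys/types.h>
#include <sys/socket.h>
typedef int res_socket_t;           /* plain file descriptor on Unix */
#define INVALID_SOCKET (-1)         /* give Unix the Winsock spelling */
#endif

/* Nonzero if s is a usable socket. Only the sentinel is rejected;
 * the numeric size of the handle is deliberately not checked. */
int res_socket_ok(res_socket_t s)
{
    return s != INVALID_SOCKET;
}

/* Open a UDP socket for the given address family (e.g. AF_INET).
 * The caller tests the result with res_socket_ok(). */
res_socket_t res_open_dgram(int family)
{
    return socket(family, SOCK_DGRAM, 0);
}
```

The comparison against INVALID_SOCKET replaces both the "== -1" test and
the "> highestFD" test on the socket value. Any code that later hands the
socket to select() on Unix would still need its own FD_SETSIZE check there.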

I don't have time to look at the code, and it's been a long time since I
touched the BIND 8 code, but I suspect that there's a lot more work that
would need to be done here to get all of the problems fixed.

Danny